
Mailing List Archive: Wikipedia: Foundation

Possibility of a git-based fully distributed Wikipedia

 

 



mingli.yuan at gmail

Feb 18, 2008, 12:44 AM

Post #1 of 12
Possibility of a git-based fully distributed Wikipedia

Recently a git-based wiki system, git-wiki (
http://github.com/sr/git-wiki/tree/master ), was published on GitHub.
This got me thinking about a fully distributed Wikipedia clone.

Git ( http://en.wikipedia.org/wiki/Git_%28software%29 ) is a
distributed revision control system created by Linus Torvalds,
initially for Linux kernel development. As apenwarr notes in the
article "Git is the next Unix" (
http://www.advogato.org/person/apenwarr/diary/371.html ), git could
potentially be used as a distributed platform. He also points to a
git-based wiki system.

As I understand it, the key point of this kind of Wikipedia clone
would be a change in the current collaboration model.

Currently Wikipedia is based on a centralized wiki site that is fully
shared (readable and writable) by everyone. A group of trusted users
monitors recent changes to fend off trolls and vandalism, and a
conflict resolution method is also needed. As a user of Wikipedia
projects, I know how heavy the load is on some administrators who fight
trolls and vandalism, and how full of conflicts the community is.

My friend Isaac Mao has pointed out that a layer of trust is missing
in the design of the current Wikipedia software.

Then how about a fully distributed Wikipedia based on git?

Everybody has their own clone of the whole project, or maybe of part of it.

Contributors pull changes from, and push changes to, people they trust.

Some hubs will form that receive changes from many people and share
their revisions with many people.

A hub network will form, and maybe the Foundation will own some
nodes of the network and declare them official nodes.

The hub network is based on trust among different groups of people.
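
A minimal sketch of the pull-from-trusted-peers model described above, written in Python. It is only an illustration under stated assumptions: the node names and the `reachable_nodes` helper are hypothetical, and real git remotes are not modelled. It shows that a change spreads only along chains of trust, so an edit from someone nobody pulls from stays on that person's own clone.

```python
from collections import deque


def reachable_nodes(trusts, author):
    """Return the set of nodes that end up with a change made by `author`.

    `trusts` maps each node to the set of peers it pulls from. A node
    receives the change once any peer it trusts already has it.
    """
    # Invert the trust graph: for each peer, who pulls from it?
    pulled_by = {}
    for node, peers in trusts.items():
        for peer in peers:
            pulled_by.setdefault(peer, set()).add(node)

    # Breadth-first walk outward from the author along "pulled by" edges.
    have_change = {author}
    queue = deque([author])
    while queue:
        current = queue.popleft()
        for follower in pulled_by.get(current, ()):
            if follower not in have_change:
                have_change.add(follower)
                queue.append(follower)
    return have_change


# Hypothetical network; all names are illustrative assumptions.
trusts = {
    "hub": {"alice", "bob"},   # the hub pulls from alice and bob
    "official": {"hub"},       # an "official" node pulls from the hub
    "carol": {"official"},     # a reader pulls from the official node
    "alice": set(),
    "bob": set(),
    "vandal": set(),           # nobody lists the vandal as trusted
}

print(sorted(reachable_nodes(trusts, "alice")))   # ['alice', 'carol', 'hub', 'official']
print(sorted(reachable_nodes(trusts, "vandal")))  # ['vandal']
```

The sketch leaves open the questions raised later in the thread, such as how two accepted changes to the same page are merged.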

This is just my imagination. How the new collaboration model relates
to the core values of the Wikimedia Foundation (
http://meta.wikimedia.org/wiki/Values ) is still an open question.

Thanks.

User:Mountain

_______________________________________________
foundation-l mailing list
foundation-l [at] lists
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l


gerard.meijssen at gmail

Feb 18, 2008, 1:03 AM

Post #2 of 12
Re: Possibility of a git-based fully distributed Wikipedia

Hoi,
The first thing to consider is the implications of a peer-to-peer
system for distributing the content and the changes. These implications
are big. How are you going to deal with vandalism in a peer network? How
do you distribute content? How do you deal with increased demand for a
specific article that suddenly becomes popular?

Once you have a clue about these things, you can start thinking about
tooling. It may turn out that git provides a basis, or it may not. What
I do know is that the University of Amsterdam is researching this issue:
how to provide Wikipedia content and management in a peer-to-peer
environment.
Thanks,
GerardM



andreengels at gmail

Feb 18, 2008, 4:39 AM

Post #3 of 12
Re: Possibility of a git-based fully distributed Wikipedia

In a system like the one you envision, how are you going to deal with
edit-conflict-like situations? That is, if I get one edit to a page
from one person and another edit from another person, how am I going
to resolve the end state of my version of the wiki page?
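
One common answer to this kind of question is a three-way merge against the last version both editors shared. The Python sketch below is a simplified illustration, not how git or MediaWiki actually merges: it assumes a page is a mapping from section title to text, merges whole sections rather than lines, and the `three_way_merge` name is hypothetical.

```python
def three_way_merge(base, ours, theirs):
    """Merge two edited copies of a page against their common ancestor.

    Each argument maps a section title to its text. Returns the merged
    page and the list of section titles that need a human decision.
    Sections are visited in alphabetical order for determinism.
    """
    merged, conflicts = {}, []
    for title in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(title), ours.get(title), theirs.get(title)
        if o == t:        # both sides agree (or neither touched it)
            result = o
        elif o == b:      # only "theirs" changed this section
            result = t
        elif t == b:      # only "ours" changed this section
            result = o
        else:             # both changed it differently: a real conflict
            conflicts.append(title)
            result = o    # keep our text until someone resolves it
        if result is not None:  # None means the section was deleted
            merged[title] = result
    return merged, conflicts


base = {"Intro": "Git is a VCS.", "History": "Created in 2005."}
ours = {"Intro": "Git is a distributed VCS.", "History": "Created in 2005."}
theirs = {"Intro": "Git is a VCS.", "History": "Created in 2005 by Linus Torvalds."}

merged, conflicts = three_way_merge(base, ours, theirs)
print(merged["Intro"])    # Git is a distributed VCS.
print(merged["History"])  # Created in 2005 by Linus Torvalds.
print(conflicts)          # []
```

Edits to different sections merge cleanly; only when both people rewrite the same section differently does a person have to decide.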



--
Andre Engels, andreengels [at] gmail
ICQ: 6260644 -- Skype: a_engels



midom.lists at gmail

Feb 18, 2008, 9:42 AM

Post #4 of 12
Re: Possibility of a git-based fully distributed Wikipedia

Hi!

> Then how about a fully distributed Wikipedia based on git?

Apart from the technical issues, one of the major problems is simply
that the agility of the content is lost.

There is a lot of debate in the software world about the distributed
model (bazaar/mercurial/git/..) versus the centralized one. The major
difference is that in the centralized model everyone is welcome to make
small incremental changes, thus participating in community development
and building more.

Once you go distributed, the changes pushed are always far bigger in
scope and development length, and the major problem is the 'merge
jam', where there simply isn't enough manpower to merge contents and
resolve conflicts.

In the software world, QA and 'code control' departments do that job.
In volunteer projects it becomes complicated, and merges end up
happening less than once a year, and that is on projects which have
far smaller change bandwidth.

While we do support forks as a "keeping the knowledge free" strategy,
forks are not helpful for developing free knowledge.

Our model is just awesome for situations like Katrina, where we can
have lots of people building the best possible content in thousands of
revisions per day. Our conflict resolution is trivial, but it works in
many cases.

In the end, centralized versioning supports 'be bold' far more:
changes are lightweight, easy to spot, easy to fix. Reviewing a
10000-page changeset is something not many people would want to do,
and it is something that needs very strict procedures.

We already had one special case where a distributed fork was the
initial idea for a project (Citizendium), but eventually it was
ditched. It just gets too complicated and unmanageable.

So, while the distributed model works for 'teams', we're still quite a
monolithic community, and 'divide et impera' isn't that needed at the
moment.

BR,
Domas



thomas.dalton at gmail

Feb 18, 2008, 9:56 AM

Post #5 of 12
Re: Possibility of a git-based fully distributed Wikipedia

> Everybody have their own clone of the whole project, and maybe part of them.

That's a lot of hard drive space! Is it practical to have a
distributed approach to a project as large as enwiki?



midom.lists at gmail

Feb 18, 2008, 10:05 AM

Post #6 of 12
Re: Possibility of a git-based fully distributed Wikipedia

Hello!

> That's a lot of hard drive space! Is it practical to have a
> distributed approach to a project as large as enwiki?

Terabytes are cheap nowadays, aren't they?

BR,
Domas



thomas.dalton at gmail

Feb 18, 2008, 11:05 AM

Post #7 of 12
Re: Possibility of a git-based fully distributed Wikipedia

On 18/02/2008, Domas Mituzas <midom.lists [at] gmail> wrote:
> Hello!
>
> > That's a lot of hard drive space! Is it practical to have a
> > distributed approach to a project as large as enwiki?
>
> Terabytes are cheap nowadays, aren't they?

By server standards, sure, not so much by desktop standards. There's
also the matter of downloading the whole thing - with a good broadband
connection, you're talking days, with a poor connection it's pretty
much impossible.
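
The "days" estimate can be checked with a little arithmetic. The Python sketch below uses assumed figures that are not from the thread: a repository of exactly 1 TB (10**12 bytes) and sustained link speeds of 8 Mbit/s and 1 Mbit/s, with no protocol overhead or compression.

```python
def download_days(size_bytes, megabits_per_second):
    """Days needed to transfer `size_bytes` at a sustained link speed."""
    seconds = size_bytes * 8 / (megabits_per_second * 1_000_000)
    return seconds / 86_400  # seconds per day


ONE_TERABYTE = 10**12  # assumed repository size, for illustration only

print(round(download_days(ONE_TERABYTE, 8), 1))  # 11.6 (days at 8 Mbit/s)
print(round(download_days(ONE_TERABYTE, 1), 1))  # 92.6 (days at 1 Mbit/s)
```

Under these assumptions a full clone takes well over a week on a fast home connection and about three months on a slow one, which is consistent with the concern above.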



midom.lists at gmail

Feb 18, 2008, 11:09 AM

Post #8 of 12
Re: Possibility of a git-based fully distributed Wikipedia

Hi!

> By server standards, sure, not so much by desktop standards.

Actually, it's the other way around: nowadays external desktop drives
are pushing the cost per terabyte to levels which make server ops drool.

Anyway, my email was meant to point out the need for terabytes rather
than their actual cost :) Sorry if that wasn't clear.

Domas



dgerard at gmail

Feb 18, 2008, 1:27 PM

Post #9 of 12
Re: Possibility of a git-based fully distributed Wikipedia

One thing that interests me about the idea of using git (or something
like it) as the backend is not so much the distributed aspect, though
if it makes it easier to take good backups, or to replicate copies
from the central git repo, that'll be a *major* improvement over the
present situation.

But what catches my eye is the sophisticated attribution abilities of
git. Git regards the fundamental unit as the entire project; so if you
move a subroutine from one file to another, it will treat that as a
single change and the attribution will follow the lines moved.

Mapping this to a text wiki is of course not quite the same thing.
Off the top of my head: (1) What is the unit to work in? Paragraphs,
sentences, words? (2) How to detect cutting text from one article and
pasting it into another as a "move"? (2a) Cleaning up attribution by
hand afterwards? But it's an interesting and I think useful idea.
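
Question (2) can be illustrated with a small Python sketch that takes the paragraph as the unit. It is an assumption-laden toy, not git's own detection logic (git offers move and copy detection through options such as `git blame -M` and `-C`): the `find_moves` helper and the sample articles are hypothetical, and only paragraphs that are identical apart from whitespace are matched.

```python
def find_moves(old, new):
    """Find paragraphs that left one article and appeared in another.

    `old` and `new` map article titles to lists of paragraph strings.
    Returns (paragraph, source_title, target_title) tuples.
    """
    def locate(snapshot):
        # Map normalized paragraph text to the set of articles containing it.
        index = {}
        for title, paragraphs in snapshot.items():
            for paragraph in paragraphs:
                key = " ".join(paragraph.split())  # ignore whitespace changes
                index.setdefault(key, set()).add(title)
        return index

    before, after = locate(old), locate(new)
    moves = []
    for key, old_titles in before.items():
        new_titles = after.get(key, set())
        removed_from = old_titles - new_titles  # articles that lost the text
        added_to = new_titles - old_titles      # articles that gained it
        for source in sorted(removed_from):
            for target in sorted(added_to):
                moves.append((key, source, target))
    return moves


old = {
    "Git": ["Git is a VCS.", "Linus Torvalds created it."],
    "Linus Torvalds": ["Linus is a programmer."],
}
new = {
    "Git": ["Git is a VCS."],
    "Linus Torvalds": ["Linus is a programmer.", "Linus Torvalds created it."],
}

print(find_moves(old, new))
# [('Linus Torvalds created it.', 'Git', 'Linus Torvalds')]
```

A paragraph that was edited while being moved would not match here, which is where question (2a), cleaning up attribution by hand, comes in.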


- d.



rupert.thurner at wikimedia

Feb 18, 2008, 9:26 PM

Post #10 of 12
Re: Possibility of a git-based fully distributed Wikipedia

The bandwidth and space needed for getting changes in are on the one
hand an organisational matter (do I take en.wp as one git repository,
or only one page?) and on the other a technical one (does the tool
support branching/cloning parts of the whole?).

Technically, two strategies stand out a little. First, while all the
other tools track files and the changes to them, git tracks only
content and therefore notices better than the others if a paragraph
got moved from one file to another (or from one article to another).

Second, while all of them can track changes to a line and flag a
conflict in the change set if two people changed the same line, darcs
can do this at a finer level of detail. It notices if two people
changed two different words in a line and merges the changes.
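
To show why granularity matters, here is a toy word-level three-way merge in Python. It is not darcs's algorithm, only a sketch under a strong assumption: all three versions of the line have the same number of words, so only word substitutions are handled. The `merge_words` name is hypothetical.

```python
def merge_words(base, ours, theirs):
    """Three-way merge of one line at word granularity.

    Only handles word substitutions: all three versions must have the
    same number of words. Returns the merged line, or None on conflict.
    """
    b, o, t = base.split(), ours.split(), theirs.split()
    if not (len(b) == len(o) == len(t)):
        return None  # insertions and deletions are out of scope here
    merged = []
    for base_word, our_word, their_word in zip(b, o, t):
        if our_word == their_word:    # same result on both sides
            merged.append(our_word)
        elif our_word == base_word:   # only "theirs" changed this word
            merged.append(their_word)
        elif their_word == base_word:  # only "ours" changed this word
            merged.append(our_word)
        else:
            return None  # both changed the same word differently
    return " ".join(merged)


print(merge_words("the quick brown fox",
                  "the fast brown fox",
                  "the quick red fox"))  # the fast red fox
```

A line-based tool would report a conflict for this input, because both editors touched the same line; at word level the two edits do not overlap.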

There is at least one research group (ENST Bretagne) that is
interested in P2P wiki technologies and is also trying to bid for a
European government-funded project. They would be more than glad to
find people interested in that.

If you want to participate somehow, please write!

kr,

rupert.



emesee3 at gmail

Feb 23, 2008, 12:32 AM

Post #11 of 12
Possibility of a git-based fully distributed Wikipedia

also

http://meta.wikimedia.org/wiki/Proposals_for_new_projects#P2_Peerpedia

& along with that...

http://meta.wikimedia.org/wiki/P2P


gwern0 at gmail

Feb 27, 2008, 5:14 PM

Post #12 of 12
Re: Possibility of a git-based fully distributed Wikipedia

On Sat, Feb 23, 2008 at 3:32 AM, mike <emesee3 [at] gmail> wrote:
> also
>
> http://meta.wikimedia.org/wiki/Proposals_for_new_projects#P2_Peerpedia
>
> & along with that...
>
> http://meta.wikimedia.org/wiki/P2P

I've since noted a very interesting little link: 'GitTorrent' at
http://gittorrent.utsl.gen.nz/rfc.html

The abstract:

"This document describes the GitTorrent Protocol version 0.1, referred
to as "GTP/0.1". The GitTorrent Protocol (GTP) is a protocol for
collaborative git repository distribution across the Internet. It is
best classified as a peer-to-peer (P2P) protocol, although it also
contains centralized elements.

Git is a decentralized version control system (VCS) created in the
beginning of 2005 by Linus Torvalds. To date only client-server based
distribution has been supported. Although git is already able to
densely exchange updates between repositories and thereby minimize the
overall resource requirements for distribution, this will occasionally
involve clients cloning a complete repository. This places much strain
on sites hosting many git repositories in terms of request-processing
and sheer bandwidth. It is the goal of GTP to facilitate such hosting
sites in reducing resource demands by using P2P distribution.

Normally a client does not use their upload capacity while downloading
a repository. The GTP approach capitalizes on this fact by having
clients upload bits of the repository data to each other. In
comparison to the original client-server distribution, this adds huge
scalability and cost-management advantages. People can set up
"mirrors" of torrents, and have them advertised with less
administration and more convenience to the user than setting up a
regular mirror."

--
gwern

