
Mailing List Archive: Wikipedia: Wikitech

memento: time warp for mediawiki

 

 



daniel at brightbyte

Nov 12, 2009, 5:13 AM

Post #1 of 23 (2304 views)
memento: time warp for mediawiki

Hi all

The Memento Project <http://www.mementoweb.org/> (including the Los Alamos
National Laboratory (!) featuring Herbert Van de Sompel of OpenURL fame) is
proposing a new HTTP header, X-Accept-Datetime, to fetch old versions of a web
resource. They already wrote a MediaWiki extension for this
<http://www.mediawiki.org/wiki/Extension:Memento> - which would of course be
particularly interesting for use on Wikipedia.

Do you think we could have this for Wikimedia projects? I think that would be
very nice indeed. I recall that ways to look at last week's main page have been
discussed before, and I see several issues:

* the timestamp isn't a unique identifier, multiple revisions *might* have the
same timestamp. We need a tiebreak (rev_id would be the obvious choice).
* templates and images also need to be "time warped". It seems like the
extension does not address this at the moment. For flagged revisions we do have
such a mechanism, right? Could that be used here?
* Squids would need to know about the new header, and bypass the cache when
it's used.
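
To make the tiebreak in the first point concrete, here is a minimal sketch of picking the revision that was current at a given time. The Revision type and the revision_at helper are hypothetical names for illustration, not MediaWiki or Memento code; timestamps are assumed to be 14-digit YYYYMMDDHHMMSS strings, which compare correctly as plain strings.

```python
from typing import NamedTuple, Optional, Sequence


class Revision(NamedTuple):
    rev_id: int
    timestamp: str  # assumed 14-digit form, e.g. "20091112051300"


def revision_at(revisions: Sequence[Revision], when: str) -> Optional[Revision]:
    """Return the revision that was current at `when`, or None if none existed yet."""
    candidates = [r for r in revisions if r.timestamp <= when]
    if not candidates:
        return None
    # Compare on (timestamp, rev_id): among revisions saved in the same
    # second, the one with the highest rev_id wins the tie.
    return max(candidates, key=lambda r: (r.timestamp, r.rev_id))


history = [
    Revision(101, "20091111120000"),
    Revision(102, "20091112051300"),
    Revision(103, "20091112051300"),  # same second as revision 102
    Revision(104, "20091113000000"),
]
chosen = revision_at(history, "20091112235959")
```

Choosing the highest rev_id is only one possible policy; the lowest rev_id, or asking the user, would be equally easy to express by changing the sort key.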

so, what do you think? what does it take? Can we point them to the missing bits?

-- daniel

_______________________________________________
Wikitech-l mailing list
Wikitech-l [at] lists
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


mdale at wikimedia

Nov 12, 2009, 6:23 AM

Post #2 of 23 (2260 views)
Re: memento: time warp for mediawiki

Instead of writing it as an extra header for the HTTP protocol ... why don't
they write it as a proxy to Wikimedia (or any other site they want to
proxy temporally)? Getting a new HTTP header out there is not an easy task:
at best a small percentage of sites will support it, and then you need to
deploy clients and write user interfaces that support it as well.

If viewing old versions of sites is something interesting to them, it is
probably best to write an interface, such as a Firefox extension or
Greasemonkey script, that builds a "temporal" interface of their liking
on top of the MediaWiki API (presumably the "history" button fails to
represent their vision?) ... Non-MediaWiki sites could access the
Wayback Machine.

If the purpose is to support searching or archival, then it's probably
best to route the MediaWiki API through a proxy that they set up and that
supports those temporal requests across all sites (i.e. an enhanced
interface to the Wayback Machine?)

--michael



smolensk at eunet

Nov 12, 2009, 7:43 AM

Post #3 of 23 (2261 views)
Re: memento: time warp for mediawiki

Daniel Kinzler wrote:
> * the timestamp isn't a unique identifier, multiple revisions *might* have the
> same timestamp. We need a tiebreak (rev_id would be the obvious choice).

I'd say it is, if sufficiently precise :) If not, either use the
lowest/highest rev_id, or the user could be asked to choose a version.

> * templates and images also need to be "time warped". It seems like the
> extension does not address this at the moment. For flagged revisions we do have
> such a mechanism, right? Could that be used here?

I see three independent things here:

1) When viewing a past version of a page, show appropriate templates,
images, magic words etc.

2) When viewing a past version of a page, link to other pages as
appropriate (show red links if they did not exist yet, link to their
appropriate past version if they did). I'd say this is the easiest to
implement, and the most interesting for readers.

3) Ability to view a page as it looked at a certain time (as opposed to
a certain revision).



Simetrical+wikilist at gmail

Nov 12, 2009, 7:52 AM

Post #4 of 23 (2281 views)
Re: memento: time warp for mediawiki

On Thu, Nov 12, 2009 at 10:43 AM, Nikola Smolenski <smolensk [at] eunet> wrote:
> I'd say it is, if sufficiently precise :)

MediaWiki only keeps timestamps to one-second precision, so it's not.



hvdsomp at gmail

Nov 12, 2009, 12:55 PM

Post #5 of 23 (2260 views)
Re: memento: time warp for mediawiki

hi Daniel, all,

Jakob Voss informed me that there had been a posting regarding memento
on this list. Since we are very keen on native Memento support for the
mediawiki platform, I felt like responding by giving some perspective
on Memento in general, and on some questions that were raised
regarding its implementation for mediawiki, specifically:

1. As with previous projects we have engaged in (OpenURL, OAI-PMH,
OAI-ORE, SRU/W), Memento is not merely an academic exercise about which we
want to publish papers. Given our status as researchers, we do need
to publish the occasional paper but the real goal of the project is to
make datetime content negotiation for the Web really happen. Doing so
will require a lot more work, at various levels, including:

a. formal specification (we are thinking an Internet Draft => RFC path),
b. promotion,
c. real life implementations,
d. further research.

Currently, we are focusing on (b) and (c) to try and overcome a
chicken and egg situation: we think we propose a nice framework to
integrate archival content seamlessly in regular web navigation, but
in order to demonstrate it we need some adoption. But in order to get
some adoption we need to be able to demonstrate the framework, which
we can't do without adoption. Etc. ;-)

It is in this context that our contact with Jakob Voss, and Daniel's
mail on this list, is really exciting. The ability to demonstrate
Memento at work for mediawiki platform, including the Wikipedia
deployments, would be absolutely fantastic for the Memento cause. That
is why we have immediately engaged in further development of our
initial prototype Memento plug-in for mediawiki, to take into account
remarks made by Jakob, and to make it more robust. The ongoing work is
at http://www.mediawiki.org/wiki/Extension:Memento .

2. Let me describe the actual status and challenges faced in the
Memento plug-in work:

2.1. The plug-in detects a client's X-Accept-Datetime header, and
returns the mediawiki page that was active at the datetime specified
in the header. Same for images, actually. This effectively allows
navigating (as in clicking links) a mediawiki collection as it existed
in the past: as long as a client issues an X-Accept-Datetime header,
matching history pages/images will be retrieved.
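
For readers who want to try this from a script, a client request could be built roughly as follows. This is a sketch only: memento_request is a hypothetical helper, and the assumption that the header value is an HTTP-date (the same format as If-Modified-Since) is mine, not something stated in this mail.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from urllib.request import Request


def memento_request(url: str, when: datetime) -> Request:
    """Build (but do not send) a request asking for `url` as of `when`."""
    # Assumption: the header carries an HTTP-date in GMT.
    stamp = format_datetime(when.astimezone(timezone.utc), usegmt=True)
    return Request(url, headers={"X-Accept-Datetime": stamp})


req = memento_request(
    "http://en.wikipedia.org/wiki/Clock",
    datetime(2009, 1, 1, tzinfo=timezone.utc),
)
# urllib normalises header capitalisation, so look names up case-insensitively.
sent_headers = {name.lower(): value for name, value in req.header_items()}
```

Passing req to urllib.request.urlopen would send it; a server without the plug-in would simply ignore the extra header and return the current page.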

2.2. We are looking into addressing this issue raised by Jakob (and
Daniel): Display history pages with the template that was active at
the time the history page acted as the current one. We definitely
think this would be cool, but we don't think it can be achieved by our
plug-in because templates are included at the server side, i.e. they
are not URI-addressed XSL that are rendered at the client-side. Hence,
one can't do datetime content negotiation on them - they are outside
of the memento realm and rather in the realm of the CMS. So, we are
looking at the mediawiki code to see whether a history page, when
rendered, could itself retrieve the appropriate (old) template from
the database. If we are successful, we will share that code as well at
http://www.mediawiki.org/wiki/Extension:Memento once available. It will
obviously be up to the mediawiki community whether they are willing to
adopt the proposed change to the codebase.

2.3. We have looked into another issue raised by Jakob: Display
deleted pages as they existed at the datetime expressed in
X-Accept-Datetime. We have actually implemented this. There are 2 caveats:
- as is the case with mediawiki in general, deleted pages are only
accessible by those with appropriate permissions;
- as is the case with mediawiki in general, deleted pages show up in
Edit mode.
This code will soon be included at
http://www.mediawiki.org/wiki/Extension:Memento .

2.4. We do not feel that all pages should necessarily be subject to
datetime content negotiation, in the same way that not all URIs are
subject to content negotiation in other dimensions. We feel that the
Special Pages fall under this category, as they do not have History.

2.5. We have ideas regarding how to address the issue raised by
Daniel: the timestamp isn't a unique identifier, multiple revisions
*might* have the same timestamp. From the perspective of Memento, a
datetime is obviously the only "globally" recognizable value that can be
used for negotiation. If cases occur where multiple versions of a page
exist for the same second, the thing to do according to RFC 2295 would be
to return a "300 Multiple Choices", listing the URIs (and metadata) of
those versions in an Alternates header. The client then has to take it
from there.
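
As an illustration of what such a response might carry, the sketch below formats an Alternates value for a set of same-second revisions. The function names, the example.org URIs and the fixed source quality of 1.0 are assumptions for illustration; the brace syntax follows my reading of the variant list format in RFC 2295 and should be checked against the RFC before relying on it.

```python
from typing import Dict, Sequence, Tuple


def alternates_header(variant_uris: Sequence[str]) -> str:
    """Format variant URIs as an RFC 2295 style variant list."""
    # Each entry: {"URI" source-quality {type media-type}}
    return ", ".join(
        f'{{"{uri}" 1.0 {{type text/html}}}}' for uri in variant_uris
    )


def tie_response(variant_uris: Sequence[str]) -> Tuple[int, Dict[str, str]]:
    """Status code and headers for a datetime that matches several revisions."""
    return 300, {"Alternates": alternates_header(variant_uris)}


status, headers = tie_response([
    "http://example.org/w/index.php?title=Clock&oldid=102",
    "http://example.org/w/index.php?title=Clock&oldid=103",
])
```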

2.6. The caching issue is a general problem arising from introducing
Memento in a web that does not (yet) do Memento: when in datetime
content negotiation mode all caches between client and server (both
included) need to be bypassed. As described in our paper, we currently
address this problem by adding the following client headers:

Cache-Control: no-cache => to force cache revalidation, and
If-Modified-Since: Thu, 01 Jan 1970 00:00:00 GMT => to enforce
validation failure

We very much understand this is not elegant but it tends to work ;-) .
This is an area for further research. As the paper states: "Ideally,
a solution should leverage existing caching practice but extend it in
such a way that caches are only bypassed in DT-conneg when essential,
but still used whenever possible (e.g., to deliver Mementos)."
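
On the client side, the workaround described above amounts to adding two request headers whenever datetime negotiation is in use. A minimal sketch, with with_cache_bypass being a hypothetical helper name:

```python
from typing import Dict

# The Unix epoch as an HTTP-date: any cached copy is newer than this,
# so validation against it always fails.
EPOCH_HTTP_DATE = "Thu, 01 Jan 1970 00:00:00 GMT"


def with_cache_bypass(headers: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of `headers` with the two cache-bypass headers added."""
    out = dict(headers)
    out["Cache-Control"] = "no-cache"            # force revalidation
    out["If-Modified-Since"] = EPOCH_HTTP_DATE   # force validation failure
    return out


original = {"X-Accept-Datetime": "Thu, 01 Jan 2009 00:00:00 GMT"}
request_headers = with_cache_bypass(original)
```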

I hope this helps. Please let us know what we can do to increase the
chances of adoption of the Memento solution for the mediawiki
platform. I hope it is clear that we _really_ would like to see this
happen!

Cheers

Herbert Van de Sompel

==
Herbert Van de Sompel
Digital Library Research & Prototyping
Los Alamos National Laboratory, Research Library
http://public.lanl.gov/herbertv/
tel. +1 505 667 1267







azaroth42 at gmail

Nov 12, 2009, 1:28 PM

Post #6 of 23 (2262 views)
Re: memento: time warp for mediawiki

Hi Michael and all,

The first thing which we implemented was exactly this idea of a proxy using
the wikipedia API.

The proxy is here:
http://mementoproxy.lanl.gov/wiki/timegate/(wikipedia URI)

For example:

http://mementoproxy.lanl.gov/wiki/timegate/http://en.wikipedia.org/wiki/Clock
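
Going by the example above, the proxy address is just the timegate prefix followed by the original URI. A small sketch of that construction (timegate_url is a hypothetical helper; whether the proxy expects any escaping of the embedded URI is not stated here, so none is applied):

```python
TIMEGATE_PREFIX = "http://mementoproxy.lanl.gov/wiki/timegate/"


def timegate_url(original_uri: str) -> str:
    """Prefix a wiki page URI with the proxy's timegate address."""
    # The original URI is appended verbatim, as in the example in the mail.
    return TIMEGATE_PREFIX + original_uri


clock_timegate = timegate_url("http://en.wikipedia.org/wiki/Clock")
```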

We have also implemented proxies for the Internet Archive, Archive-It,
WebCitation.org and several others, as proof-of-concept pieces for the
research.

There are several reasons why a native implementation is better for all
concerned:

1. The browser somehow needs to know where the proxy is, rather than being
natively redirected to the correct page. For a few websites, and a few
proxies, this is tolerable. However even one proxy per CMS would be an
impossible burden to maintain, let alone one proxy per website!

2. If the website redirected to the proxy, rather than the client knowing
where to go, then this would be on trust that the proxy behaved correctly.
In a native implementation, you're never redirected off-site.

3. The proxy will redirect back to the appropriate history page, however
this page doesn't know that it's being treated as a Memento, and will not
issue the X-Datetime-Validity or X-Archive-Interval headers. This makes it
difficult (but not impossible) for the client to trap that it has been
redirected correctly.

4. The offsite redirection adds at least 2 extra HTTP transactions per
resource, slowing down the retrieval. In the native implementation the main
page redirects to the history page directly. In the proxy, the browser goes
to the main page, then either knows of or is redirected to the proxy, the
proxy makes one or more API calls to fetch the history for the page to
calculate the right revision, and then redirects the client back there.

5. We don't have to maintain the proxies :)


So for wikimedia installations the native approach is better as it's trusted
and faster and involves fewer API calls. For the client it's better as it's
faster and doesn't require intelligence or a list of proxies. For the proxy
maintainer it's better as the proxies are no longer needed.

I hope that helps clarify things,

Rob Sanderson
(Also at Los Alamos with Herbert Van de Sompel)




Platonides at gmail

Nov 12, 2009, 2:19 PM

Post #7 of 23 (2260 views)
Re: memento: time warp for mediawiki

Hello Herbert.

Herbert Van de Sompel wrote:
> 2. Let me describe the actual status and challenges faced in the
> Memento plug-in work:
>
> 2.1. The plug-in detects a client's X-Accept-Datetime header, and
> returns the mediawiki page that was active at the datetime specified
> in the header. Same for images, actually.

> 2.2. Display history pages with the template that was active at
> the time the history page acted as the current one. [Snip] So, we are
> looking at the mediawiki code to see whether a history page, when
> rendered, could itself retrieve the appropriate (old) template from
> the database. If we are successful, we will share that code also at http://www.mediawiki.org/wiki/Extension:Memento
> once available. It will obviously be up to the mediawiki community
> whether they are willing to adopt the proposed change to the codebase.

Obviously it's a server issue.


> 2.3. We have looked into another issue raised by Jakob: Display
> deleted pages as they existed at the datetime expressed in X-Datetime-
> Accept. We have actually implemented this. There are 2 caveats:
> - as is the case with mediawiki in general, deleted pages are only
> accessible by those with appropriate permissions;
> - as is the case with mediawiki in general, deleted pages show up in
> Edit mode.
> This code will soon be included at http://www.mediawiki.org/wiki/Extension:Memento

Showing deleted pages in edit mode is not always the case, since they
can be rendered (albeit not with the old templates, which would be an
interesting enhancement by your work).


It is impressive how far you have gone. However, I don't think you can
do a *complete* implementation.

First, you should be aware that timemachining the pages has been tried
in the past. Discussions about FlaggedRevs are also relevant for
your project.
FlaggedRevs is an extension which allows marking the status of a page
(e.g. not vandalised) at a point in time. A naive implementation would
store the timestamp and get the old version from the archive. They ended
up storing the page content with templates transcluded in a table
specific to the extension.
However, FlaggedRevs is a tool to fight vandalism. Yours is an archival
one. You could accept imperfect results under certain circumstances.


Problematic aspects:

Page moves/image moves:
*You want to see the content of Foo at epoch, but the history now at Foo is
wrong. Instead you need to look at the history of the page now at
Foo_(disambiguation).
You need to follow (perhaps even many times) the move logs to find out
the real page.
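
A rough sketch of what "following the move logs" means in practice. The Move record and current_home function are hypothetical and ignore details of the real log schema; timestamps are assumed to be sortable YYYYMMDDHHMMSS strings.

```python
from typing import NamedTuple, Sequence


class Move(NamedTuple):
    timestamp: str   # when the move happened
    old_title: str
    new_title: str


def current_home(title: str, when: str, moves: Sequence[Move]) -> str:
    """Find the present-day title holding the history of `title` as of `when`."""
    current = title
    # Replay every later move in chronological order; each move that
    # starts from the title we are tracking carries the history along.
    for move in sorted(moves, key=lambda m: m.timestamp):
        if move.timestamp > when and move.old_title == current:
            current = move.new_title
    return current


move_log = [
    Move("20080301000000", "Foo", "Foo_(disambiguation)"),
    Move("20080301000500", "Foo_(band)", "Foo"),
]
old_foo_home = current_home("Foo", "20070101000000", move_log)
new_foo_home = current_home("Foo", "20090101000000", move_log)
```

Here the 2007 content of Foo is found in the history of Foo_(disambiguation), while a request for 2009 stays at Foo.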

Page merges:
*When two pages have been merged, you will want to show the revision
which was originally at the page the user wants to timemachine. You can
no longer just rely on the timestamps. You may be able to get that by
splitting the sources at the merge time and going back via
rev_parent_id. Needless to say, this is very inefficient; this piece
wouldn't be put live at Wikipedia.

Partial undeletions:
*When a page is undeleted, the summary shows how many revisions were
undeleted, but not *which* ones.

Case:
*Page A has two edits (#1 and #2).
*A vandal adds obscene content to it (#3).
*Admin deletes the page and restores the two first revisions.
*Several months later, the page is completely deleted.

When an admin wants to view what the page looked like during those months,
an application is unable to determine whether the two revisions which had
been shown were #1 and #2 or perhaps #2 and #3.


revdelete may have similar issues.




> 2.6. The caching issue is a general problem arising from introducing
> Memento in a web that does not (yet) do Memento: when in datetime
> content negotiation mode all caches between client and server (both
> included) need to be bypassed. As described in our paper, we currently
> address this problem by adding the following client headers:
>
> Cache-Control: no-cache => to force cache revalidation, and
> If-Modified-Since: Thu, 01 Jan 1970 00:00:00 GMT' to enforce
> validation failure
>
> We very much understand this is not elegant but it tends to work ;-) .


The caching issue is IMHO the bigger problem in your approach using the
new header.
Disabling the cache on the request kind of works (although not in the long
term), but you also need to disable caching at the server, so that
someone who later requests the current page through the same proxy
(ignorant of X-Accept-Datetime) doesn't get the cached page you were
served earlier.

RFC 2616 states very clearly that unrecognized header fields "MUST be
forwarded by transparent proxies", but in your case it'd have been
preferable that the header wasn't forwarded if the proxy isn't memento aware.

Which leads us to another issue, which is that it seems your server
implementation doesn't "acknowledge" memento, so given a response to an
X-Accept-Datetime request, you don't know if what you're getting is the
version you requested or the current one (because the server ignored it).
It can be as simple as requiring a Last-Modified <= X-Accept-Datetime on
X-Accept-Datetime responses (that would allow the server to explicitly
tell since when it is valid), but extended to all response codes.
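
The check suggested above could look like this on the client. It is only a sketch of the proposed rule, not something the extension or the Memento guide specifies; looks_like_memento is a hypothetical name and both values are assumed to be HTTP-dates.

```python
from email.utils import parsedate_to_datetime
from typing import Mapping


def looks_like_memento(requested: str, response_headers: Mapping[str, str]) -> bool:
    """True if Last-Modified is not later than the requested datetime."""
    last_modified = response_headers.get("Last-Modified")
    if last_modified is None:
        # Without the header the client cannot tell whether the server
        # honoured the request, so treat the response as unacknowledged.
        return False
    return parsedate_to_datetime(last_modified) <= parsedate_to_datetime(requested)


wanted = "Thu, 01 Jan 2009 00:00:00 GMT"
honoured = looks_like_memento(wanted, {"Last-Modified": "Mon, 15 Dec 2008 10:00:00 GMT"})
ignored = looks_like_memento(wanted, {"Last-Modified": "Thu, 12 Nov 2009 05:13:00 GMT"})
```

Note the weakness of this rule: a current page that has not been edited since before the requested datetime passes the check too, which is correct only because in that case the current version is also the right answer.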




hvdsomp at gmail

Nov 12, 2009, 3:09 PM

Post #8 of 23 (2261 views)
Re: memento: time warp for mediawiki

On Nov 12, 2009, at 3:19 PM, Platonides wrote:
> It is impressive how far you have gone. However, I don't think you can
> do a *complete* implementation.
> [Snip]
> However, FlaggedRevs is a tool to fight vandalism. Yours is an archival
> one. You could accept imperfect results under certain circumstances.


Indeed, it suffices to look at the Internet Archive and comparable web
archives to see that one needs to live with what is reasonably
achievable, not with what one would love to have. Imperfection is
allowed when looking at this problem from an archival perspective.

Related to this, one must be careful not to cross the border between:

(a) what can purely be achieved using the primitives of the web
architecture (URI, resource, representation), and HTTP, with datetime
content negotiation added to the mix;
(b) what is in the realm of content, interpretation, etc.

Let me explain what I mean: Wikipedia used to have a page for "Alito".
The page got discontinued and in its place came a page "Samuel Alito".
Both have their separate URIs, and so for each individually datetime
content negotiation will work nicely. That is what I mean with (a)
above. However, connecting "Alito" and "Samuel Alito" moves us into
the realm of (b). Things could be done in this specific type of case,
as redirects are in place between the Alito and Samuel Alito URIs
(unfortunately not the 301 or 302 one would expect but rather a 200),
meaning such redirection info is in the database. Hence it could be
acted upon. And so we could explore this, although I feel this gets
us into the (b) zone. Again, generally speaking we must remain aware
of the line between (a) and (b) above.

Cheers

herbert




hvdsomp at gmail

Nov 12, 2009, 3:20 PM

Post #9 of 23 (2262 views)
Re: memento: time warp for mediawiki

On Nov 12, 2009, at 3:19 PM, Platonides wrote:
>
>> 2.6. The caching issue is a general problem arising from introducing
>> Memento in a web that does not (yet) do Memento: when in datetime
>> content negotiation mode all caches between client and server (both
>> included) need to be bypassed. As described in our paper, we
>> currently
>> address this problem by adding the following client headers:
>>
>> Cache-Control: no-cache => to force cache revalidation, and
>> If-Modified-Since: Thu, 01 Jan 1970 00:00:00 GMT' to enforce
>> validation failure
>>
>> We very much understand this is not elegant but it tends to
>> work ;-) .
>
>
> The caching issue is IMHO the bigger problem in your approach using
> the new header.
> Disabling the cache on the request kind of works (although not in the
> long term), but you also need to disable caching at the server, so that
> someone who later requests the current page through the same proxy
> (ignorant of X-Accept-Datetime) doesn't get the cached page you were
> served earlier.

Agreed, of course, that our current cache fix is a temp solution.

Not sure what you mean by the above remark, but it is totally fine to
cache the current page in mediawiki because the history pages are not
served from the URI of the current page, neither by our plug-in nor in
Memento in general (see http://www.mementoweb.org/guide/http/local/).
Rather, an X-Accept-Datetime request is redirected (302 Found) to an
appropriate history resource that has its own URI (with title and
oldid in case of mediawiki). And, hence, even those history pages can
be cached by mediawiki equipped with the memento plug-in.
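
To illustrate the redirect described above: the negotiated response only points at a history URI, which is an ordinary, separately cacheable resource. In this sketch history_redirect and the example.org script path are hypothetical; the title/oldid query parameters are the usual mediawiki way of addressing an old revision.

```python
from typing import Dict, Tuple
from urllib.parse import urlencode


def history_redirect(script_url: str, title: str, oldid: int) -> Tuple[int, Dict[str, str]]:
    """Status and headers redirecting a datetime request to one revision."""
    query = urlencode({"title": title, "oldid": oldid})
    # The current page's URI never serves old content itself; it only
    # points the client at the revision's own URI.
    return 302, {"Location": f"{script_url}?{query}"}


status, headers = history_redirect("http://example.org/w/index.php", "Clock", 102)
```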

> RFC 2616 states very clearly that unrecognized header fields "MUST be
> forwarded by transparent proxies", but in your case it'd have been
> preferable that the header wasn't forwarded if the proxy isn't memento
> aware.
>
> Which leads us to another issue, which is that it seems your server
> implementation doesn't "acknowledge" memento, so given a response to an
> X-Accept-Datetime request, you don't know if what you're getting is the
> version you requested or the current one (because the server ignored it).
> It can be as simple as requiring a Last-Modified <= X-Accept-Datetime on
> X-Accept-Datetime responses (that would allow the server to explicitly
> tell since when it is valid), but extended to all response codes.
>


Actually, have a look at http://www.mementoweb.org/guide/http/local/ .
You will note that the following response header is always included:

X-Archive-Interval: {datetime_start} - {datetime_end}

This allows a client to understand that it received a history resource. The
values to use are the start datetime and end datetime for which the
server has representations for the URI at hand.

Our plug-in implements this for mediawiki. Our proxy can't do this.
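
A client could read the interval along these lines. This is a sketch under assumptions: parse_archive_interval is a hypothetical helper, and the two datetimes are assumed to be HTTP-dates separated by " - " as in the template above; the guide linked above is the authority on the actual format.

```python
from datetime import datetime
from email.utils import parsedate_to_datetime
from typing import Tuple


def parse_archive_interval(value: str) -> Tuple[datetime, datetime]:
    """Split an X-Archive-Interval value into (start, end) datetimes."""
    # HTTP-dates contain no " - " themselves, so one split is enough.
    start_text, end_text = value.split(" - ", 1)
    return (
        parsedate_to_datetime(start_text.strip()),
        parsedate_to_datetime(end_text.strip()),
    )


start, end = parse_archive_interval(
    "Mon, 15 Jan 2001 00:00:00 GMT - Thu, 12 Nov 2009 05:13:00 GMT"
)
```

The absence of the header is then the signal that the server did not take part in the negotiation.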

Cheers

herbert




hariharshankar at gmail

Nov 12, 2009, 3:38 PM

Post #10 of 23 (2261 views)
Re: memento: time warp for mediawiki

We have made some updates to the Memento extension and we have also
written a fix to perform datetime content negotiation on transcluded
templates. Details can be found in the wiki page for the extension
http://www.mediawiki.org/wiki/Extension:Memento .
Harihar
(Los Alamos National Labs)

Herbert Van de Sompel wrote:
> On Nov 12, 2009, at 3:19 PM, Platonides wrote:
>
>>> 2.6. The caching issue is a general problem arising from introducing
>>> Memento in a web that does not (yet) do Memento: when in datetime
>>> content negotiation mode all caches between client and server (both
>>> included) need to be bypassed. As described in our paper, we
>>> currently
>>> address this problem by adding the following client headers:
>>>
>>> Cache-Control: no-cache => to force cache revalidation, and
>>> If-Modified-Since: Thu, 01 Jan 1970 00:00:00 GMT' to enforce
>>> validation failure
>>>
>>> We very much understand this is not elegant but it tends to
>>> work ;-) .
>>>
>> The caching issue is IMHO the bigger problem in your approach using
>> the
>> new header.
>> Disabling cache on the request kind of work (although not in the long
>> term), but you also need to disable caching at the server, so when
>> someone accessing by your same proxy (ignorant of X-Accept-Datetime)
>> to
>> the current page doesn't get the cached page you were served earlier.
>>
>
> Agreed, of course, that our current cache fix is a temp solution.
>
> Not sure what you mean by the above remark, but it is totally fine to
> cache the current page in mediawiki because the history pages are not
> served from the URI of the current page, neither by our plug-in nor in
> Memento in general (see http://www.mementoweb.org/guide/http/local/).
> Rather, an X-Accept-Datetime request is redirected (302 Found) to an
> appropriate history resource that has its own URI (with title and
> oldid in case of mediawiki). And, hence, even those history pages can
> be cached by mediawiki equipped with the memento plug-in.
>
>
>> RFC 2145 states very clearly that "A proxy MUST forward an unknown
>> header", but in your case it'd have been preferable that the header
>> wasn't forwarded if the proxy isn't memento aware.
>>
>> Which leads us to another issue, which is that it seems your server
>> implementation doesn't "acknowledge" memento, so given a response to an
>> X-Accept-Datetime request, you don't know if what you're getting is the
>> version you requested or the current one (because the server ignored it).
>> It can be as simple as requiring a Last-Modified <= X-Accept-Datetime on
>> Accept-Datetime responses (that would allow the server to explicitly
>> tell since when it is valid), but extended to all response codes.
>>
>>
>
>
> Actually, have a look at http://www.mementoweb.org/guide/http/local/ .
> You will note that the following response header is always included:
>
> X-Archive-Interval: {datetime_start} - {datetime_end}
>
> This allows a client to understand that it received a history resource.
> The values to use are the start datetime and end datetime for which the
> server has representations for the URI at hand.
>
> Our plug-in implements this for mediawiki. Our proxy can't do this.
>
> Cheers
>
> herbert
>
>
> ==
> Herbert Van de Sompel
> Digital Library Research & Prototyping
> Los Alamos National Laboratory, Research Library
> http://public.lanl.gov/herbertv/
> tel. +1 505 667 1267


tstarling at wikimedia

Nov 12, 2009, 4:15 PM

Post #11 of 23 (2254 views)
Permalink
Re: memento: time warp for mediawiki [In reply to]

Daniel Kinzler wrote:
> Hi all
>
> The Memento Project <http://www.mementoweb.org/> (including the Los Alamos
> National Laboratory (!) featuring Herbert Van de Sompel of OpenURL fame) is
> proposing a new HTTP header, X-Accept-Datetime, to fetch old versions of a web
> resource. They already wrote a MediaWiki extension for this
> <http://www.mediawiki.org/wiki/Extension:Memento> - which would of course be
> particularly interesting for use on Wikipedia.
>
> Do you think we could have this for Wikimedia projects? I think that would be
> very nice indeed. I recall that ways to look at last week's main page have been
> discussed before, and I see several issues:
>
> * the timestamp isn't a unique identifier, multiple revisions *might* have the
> same timestamp. We need a tiebreak (rev_id would be the obvious choice).
> * templates and images also need to be "time warped". It seems like the
> extension does not address this at the moment. For flagged revisions we do have
> such a mechanism, right? Could that be used here?
> * Squids would need to know about the new header, and bypass the cache when
> it's used.

You can't view the main page as it was in the past, because users
routinely upload temporary images to display there, so that they can
be protected, and then delete them once they're off the page.

Also, we can't have people crawling Wikipedia while requesting old
versions, because of the excessive disk seeking and CPU usage that
would generate. That's why the history page has a robot policy of
noindex, nofollow.

-- Tim Starling




azaroth42 at gmail

Nov 12, 2009, 4:33 PM

Post #12 of 23 (2252 views)
Permalink
Re: memento: time warp for mediawiki [In reply to]

Hi Tim,

If there's a problem with viewing past versions of the main page, that's
perfectly okay -- it can be excluded from the resources that are datetime
content negotiable, just as the Special: pages are.

I admit to not following the second issue completely. A regular robot would
never issue the X-Accept-Datetime to jump back in time, so that's okay. A
regular robot would also respect the history page policy and not crawl
backwards either, as you say. A robot that did issue X-Accept-Datetime
would end up crawling old revision pages and never hit a history list, but
this could also be forbidden via robots.txt if the revision pages were
excluded too?

However, it seems like it will be a long time before people write past-web
crawlers, and the use case for even doing it at all is pretty hard to come up
with. :)

Hope this addresses your concerns!

Rob



Simetrical+wikilist at gmail

Nov 12, 2009, 6:25 PM

Post #13 of 23 (2246 views)
Permalink
Re: memento: time warp for mediawiki [In reply to]

On Thu, Nov 12, 2009 at 3:55 PM, Herbert Van de Sompel
<hvdsomp [at] gmail> wrote:
> 2.1. The plug-in detects a client's X-Accept-Datetime header, and
> returns the mediawiki page that was active at the datetime specified
> in the header. Same for images, actually. This effectively allows
> navigating (as in clicking links) a mediawiki collection as it existed
> in the past: as long as a client issues an X-Accept-Datetime header,
> matching history pages/images will be retrieved.

Doesn't the use of a header here violate the idea of each URL
representing only one resource? The server will be returning totally
different things for a GET to the same URL. That seems like it would
cause all sorts of problems -- not only do caching proxies break
(which I'd think by itself makes the feature unusable for users behind
caching proxies), but how do you deal with things like bookmarking, or
sending a link to a particular version of the page to someone? These
would become impossible, unless the server goes to the extra effort to
return a redirect.

It seems to me like a better path would be to have different URLs for
different dates. The obvious way to do this would be to take an
approach like OpenSearch, and provide a URL pattern in some standard
format. Maybe the page could contain <link rel=oldversions> or such,
with the client appending a query parameter to the given URL, say
time=T where T is an ISO 8601 string.
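
A minimal sketch of the query-parameter idea, in Python. The parameter name
`time` comes from the suggestion above; the function name `url_at_time`, the
example host, and the choice of a UTC timestamp with a trailing "Z" are
assumptions made only for illustration, not part of any proposal in this
thread.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def url_at_time(base_url: str, when: datetime, param: str = "time") -> str:
    """Return base_url with a time=<ISO 8601> query parameter appended.

    `when` must be timezone-aware; it is converted to UTC before formatting.
    Any existing parameter with the same name is replaced.
    """
    parts = urlsplit(base_url)
    # Keep every existing query parameter except a previous time parameter.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != param]
    stamp = when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    query.append((param, stamp))
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    moment = datetime(2009, 11, 12, 18, 25, 0, tzinfo=timezone.utc)
    print(url_at_time("http://example.org/w/index.php?title=Clock", moment))
```

The colons in the timestamp are percent-encoded by `urlencode`, so the
resulting URL can be bookmarked or pasted into an email as-is.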



daniel at brightbyte

Nov 13, 2009, 1:08 AM

Post #14 of 23 (2244 views)
Permalink
Re: memento: time warp for mediawiki [In reply to]

Aryeh Gregor schrieb:
> Doesn't the use of a header here violate the idea of each URL
> representing only one resource? The server will be returning totally
> different things for a GET to the same URL. That seems like it would
> cause all sorts of problems -- not only do caching proxies break
> (which I'd think by itself makes the feature unusable for users behind
> caching proxies), but how do you deal with things like bookmarking, or
> sending a link to a particular version of the page to someone? These
> would become impossible, unless the server goes to the extra effort to
> return a redirect.
>
> It seems to me like a better path would be to have different URLs for
> different dates. The obvious way to do this would be to take an
> approach like OpenSearch, and provide a URL pattern in some standard
> format. Maybe the page could contain <link rel=oldversions> or such,
> with the client appending a query parameter to the given URL, say
> time=T where T is an ISO 8601 string.

How about doing both? If an X-Accept-Datetime header is received, it could
trigger a 302 redirect, pointing at a URL that specifies the desired point in time.

-- daniel



hvdsomp at gmail

Nov 13, 2009, 5:02 AM

Post #15 of 23 (2240 views)
Permalink
Re: memento: time warp for mediawiki [In reply to]

On Nov 13, 2009, at 2:08, Daniel Kinzler <daniel [at] brightbyte> wrote:

> Aryeh Gregor schrieb:
>> Doesn't the use of a header here violate the idea of each URL
>> representing only one resource? The server will be returning totally
>> different things for a GET to the same URL. That seems like it would
>> cause all sorts of problems -- not only do caching proxies break
>> (which I'd think by itself makes the feature unusable for users
>> behind
>> caching proxies), but how do you deal with things like bookmarking,
>> or
>> sending a link to a particular version of the page to someone? These
>> would become impossible, unless the server goes to the extra effort
>> to
>> return a redirect.
>>
>> It seems to me like a better path would be to have different URLs for
>> different dates. The obvious way to do this would be to take an
>> approach like OpenSearch, and provide a URL pattern in some standard
>> format. Maybe the page could contain <link rel=oldversions> or such,
>> with the client appending a query parameter to the given URL, say
>> time=T where T is an ISO 8601 string.
>
> How about doing both? If an X-Accept-Datetime header is received, it
> could trigger a 302 redirect, pointing at a URL that specifies the
> desired point in time.

This is exactly what we do in Memento and with the plug-in: datetime
content negotiation (X-Accept-Datetime header) on the generic URI
(say /clock in wikipedia) followed by a 302 redirect to the time-
specific URI (title="clock"&oldid=123456 in wikipedia). The generic
URI is always only serving the current version of the page; the
history URIs are serving the history pages.

Herbert




mln at cs

Nov 13, 2009, 9:25 AM

Post #16 of 23 (2233 views)
Permalink
Re: memento: time warp for mediawiki [In reply to]

I'd like to expound on Herbert's point below. We chose 302/Location
style CN (instead of 200/Content-Location) to provide more transparency
in the process. So I can link to:

http://en.wikipedia.org/wiki/The_Cribs

but if I have my Memento FF add-on set to:

X-Accept-Datetime: {Tue, 29 January 2008 11:41:00 GMT}

I'll get redirected to:

http://en.wikipedia.org/w/index.php?title=The_Cribs&oldid=187673999

which will show up in my browser's location bar and thus linking, sharing,
etc. will be done with the correct "old" URI. This would not be the case
with 200/Content-Location style CN. If the old version is not what the
user wants to link, share, etc., then turning off the Memento add-on and
doing a reload (possibly a shift-reload) will cause FF to correctly go
back to the original URI (b/c FF does the right thing w/ the 302 semantics
that say you should reuse the original URI).
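
The 302-style negotiation described above can be sketched roughly as follows
in Python. This is not the code of the Memento extension: the function name
`negotiate`, the in-memory revision list, the made-up revision ids, and the
406 answer for dates before the first revision are all assumptions for
illustration. The header name, the braces around its value, the `Vary` value
and the `title`/`oldid` URL shape are taken from this thread.

```python
from bisect import bisect_right
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from urllib.parse import urlencode


def negotiate(title, request_headers, revisions):
    """Decide how to answer a GET for the generic URI of `title`.

    `revisions` is a list of (timestamp, rev_id) tuples sorted by timestamp,
    with timezone-aware timestamps. Returns (status, response_headers).
    """
    raw = request_headers.get("X-Accept-Datetime")
    if raw is None:
        # No datetime negotiation requested: serve the current page as usual.
        return 200, {}
    wanted = parsedate_to_datetime(raw.strip("{} "))
    # Index of the first revision that is strictly newer than `wanted`.
    idx = bisect_right([ts for ts, _ in revisions], wanted)
    vary = "negotiate, X-Accept-Datetime"
    if idx == 0:
        # Assumption: nothing existed yet at that time, so refuse.
        return 406, {"Vary": vary}
    rev_id = revisions[idx - 1][1]
    location = "/w/index.php?" + urlencode({"title": title, "oldid": rev_id})
    return 302, {"Location": location, "Vary": vary}


if __name__ == "__main__":
    utc = timezone.utc
    history = [
        (datetime(2008, 1, 20, 9, 0, 0, tzinfo=utc), 1001),
        (datetime(2008, 1, 29, 11, 0, 0, tzinfo=utc), 1002),
        (datetime(2008, 2, 1, 8, 30, 0, tzinfo=utc), 1003),
    ]
    request = {"X-Accept-Datetime": "{Tue, 29 Jan 2008 11:41:00 GMT}"}
    print(negotiate("The_Cribs", request, history))
```

Because the answer is a redirect, the time-specific URI ends up in the
location bar, which is the property the 302 style was chosen for.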

Wikipedia is sort of a special case in that the URI:

http://en.wikipedia.org/wiki/The_Cribs

will return both the current representation as well as an older
representation (if CN is requested by the client). That is, that URI is
both URI-R and URI-G in the parlance of:

http://www.mementoweb.org/guide/http/local/

Most servers that are not hooked to a CMS (like a wiki) will have URI-G be
a separate URI, presumably in a separate archive. See:

http://www.mementoweb.org/guide/http/remote/

There is already support for caching & CN, see:

http://httpd.apache.org/docs/2.3/content-negotiation.html#caching
http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13.6

Of course, the current caches don't know about "X-Accept-Datetime", but
that can come in the future (esp. when an RFC is written and the "X-" are
removed from the various headers introduced by Memento). I'm not sure if
they'll need to be aware of "Accept-Datetime" specifically, or (hopefully)
they'll do the right thing with whatever values are returned in the "Vary"
response header. We'll see.
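
To make the "do the right thing with Vary" case concrete, here is a toy
Python cache that keys entries on the URL plus the values of whatever request
headers the response named in `Vary`. The class `VaryCache` and its methods
are hypothetical and much simpler than a real HTTP cache (no expiry, no
validation); it only illustrates why a cache that honors `Vary` would not
need to know about X-Accept-Datetime specifically.

```python
class VaryCache:
    """Toy cache keyed on the URL plus the request headers named in Vary."""

    def __init__(self):
        self._vary = {}   # url -> header names from the last Vary seen for it
        self._store = {}  # (url, selected request header values) -> body

    @staticmethod
    def _lower(headers):
        # HTTP header names are case-insensitive, so normalize them.
        return {name.lower(): value for name, value in headers.items()}

    def _key(self, url, request_headers, names):
        req = self._lower(request_headers)
        return (url, tuple(req.get(name, "") for name in names))

    def put(self, url, request_headers, response_headers, body):
        vary = self._lower(response_headers).get("vary", "")
        names = tuple(sorted(n.strip().lower() for n in vary.split(",") if n.strip()))
        self._vary[url] = names
        self._store[self._key(url, request_headers, names)] = body

    def get(self, url, request_headers):
        names = self._vary.get(url)
        if names is None:
            return None  # nothing cached for this URL yet
        return self._store.get(self._key(url, request_headers, names))


if __name__ == "__main__":
    cache = VaryCache()
    vary = {"Vary": "negotiate, X-Accept-Datetime"}
    cache.put("/wiki/The_Cribs", {}, vary, "current page")
    warped = {"X-Accept-Datetime": "{Tue, 29 Jan 2008 11:41:00 GMT}"}
    print(cache.get("/wiki/The_Cribs", {}))      # hit: the current page
    print(cache.get("/wiki/The_Cribs", warped))  # miss: different header value
```

A cache that ignores `Vary` would return the first stored entry for both
requests, which is the failure mode raised earlier in the thread.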

The goal of introducing a 5th dimension for CN (to complement type,
encoding, language & charset) is that we are more likely to integrate with
the existing http infrastructure. More so, we suspect, than introducing
an RPC-like convention of arguments tacked onto URIs (e.g.,
"foo?datetime=xxx" or "foo?datetime=now") or overloading URI fragments.

regards,

Michael

----
Michael L. Nelson mln [at] cs http://www.cs.odu.edu/~mln/
Dept of Computer Science, Old Dominion University, Norfolk VA 23529
+1 757 683 6393 +1 757 683 4900 (f)



smolensk at eunet

Nov 13, 2009, 9:59 AM

Post #17 of 23 (2224 views)
Permalink
Re: memento: time warp for mediawiki [In reply to]

On Thursday 12 November 2009 16:52:54 Aryeh Gregor wrote:
> On Thu, Nov 12, 2009 at 10:43 AM, Nikola Smolenski <smolensk [at] eunet> wrote:
> > I'd say it is, if sufficiently precise :)
>
> MediaWiki only keeps timestamps to one-second precision, so it's not.

I propose the following heuristics:

1. If no revision has the requested timestamp, use the newest revision
older than the requested time.

2. If exactly one revision has the requested timestamp, use that revision.

3. If more than one revision shares the same timestamp, divide that second
into as many equal parts as there are revisions, and use the revision whose
part contains the requested time.

Suppose that someone asks for Wikipedia as it looked on 2009-11-13
18:53:11.4281. There are four revisions that have the 2009-11-13 18:53:11
timestamp: revisions 123456, 123457, 123459 and 123460. Each revision gets
its quarter of the second, and since the request falls in the 2nd quarter,
use revision 123457.
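
The three rules above can be sketched in Python like this. The function name
`pick_revision` and the in-memory list of (timestamp, rev_id) pairs are
assumptions for illustration; a real implementation would query the revision
table instead. Within one second, revisions are ordered by rev_id.

```python
from datetime import datetime


def pick_revision(revisions, requested):
    """Pick a revision id for `requested` using the sub-second heuristic.

    `revisions` is a list of (timestamp, rev_id) pairs sorted by timestamp
    and then rev_id, with timestamps stored to one-second precision.
    Returns None if no revision is old enough.
    """
    second = requested.replace(microsecond=0)
    same = [rev_id for ts, rev_id in revisions if ts == second]
    if not same:
        # Rule 1: nothing at that second, fall back to the newest older one.
        older = [rev_id for ts, rev_id in revisions if ts < second]
        return older[-1] if older else None
    if len(same) == 1:
        # Rule 2: the timestamp identifies exactly one revision.
        return same[0]
    # Rule 3: split the second into len(same) equal slots and pick the slot
    # that contains the requested fraction of the second.
    slot = requested.microsecond * len(same) // 1_000_000
    return same[slot]


if __name__ == "__main__":
    t = datetime(2009, 11, 13, 18, 53, 11)
    history = [
        (datetime(2009, 11, 13, 18, 50, 0), 123400),
        (t, 123456), (t, 123457), (t, 123459), (t, 123460),
    ]
    # 0.4281 s falls in the second quarter of the second.
    print(pick_revision(history, t.replace(microsecond=428100)))  # 123457
```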



mln at cs

Nov 13, 2009, 10:36 AM

Post #18 of 23 (2222 views)
Permalink
Re: memento: time warp for mediawiki [In reply to]

the scenario of multiple URIs for a single Datetime (second granularity,
which I think is all that RFC-822/RFC-1123 format supports) might be a
good candidate for http response "300 Multiple choices":

http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.3.1

the entity sent back with the 300 could be:

1. a TimeMap (read: ORE Resource Map), in Atom, RDF, or whatever (see the
RDF example at: http://www.mementoweb.org/guide/api/map1.rdf)

2. a custom mediawiki html entity, like a history page with just the
values for that Datetime, that allows the user to browse, compare, &
select the version they desire.

3. a combination of #1 with an XSLT that transforms the XML into an HTML
with the functionality of #2.

4. other ideas?

regards,

Michael

----
Michael L. Nelson mln [at] cs http://www.cs.odu.edu/~mln/
Dept of Computer Science, Old Dominion University, Norfolk VA 23529
+1 757 683 6393 +1 757 683 4900 (f)


agarrett at wikimedia

Nov 13, 2009, 1:55 PM

Post #19 of 23 (2221 views)
Permalink
Re: memento: time warp for mediawiki [In reply to]

On 13/11/2009, at 2:25 AM, Aryeh Gregor wrote:

> On Thu, Nov 12, 2009 at 3:55 PM, Herbert Van de Sompel
> <hvdsomp [at] gmail> wrote:
>> 2.1. The plug-in detects a client's X-Accept-Datetime header, and
>> returns the mediawiki page that was active at the datetime specified
>> in the header. Same for images, actually. This effectively allows
>> navigating (as in clicking links) a mediawiki collection as it
>> existed
>> in the past: as long as a client issues an X-Accept-Datetime header,
>> matching history pages/images will be retrieved.
>
> Doesn't the use of a header here violate the idea of each URL
> representing only one resource? The server will be returning totally
> different things for a GET to the same URL. That seems like it would
> cause all sorts of problems -- not only do caching proxies break
> (which I'd think by itself makes the feature unusable for users behind
> caching proxies), but how do you deal with things like bookmarking, or
> sending a link to a particular version of the page to someone? These
> would become impossible, unless the server goes to the extra effort to
> return a redirect.

I assume the solution to this would be a Vary: X-Accept-Datetime header.

--
Andrew Garrett
agarrett [at] wikimedia
http://werdn.us/




hvdsomp at gmail

Nov 13, 2009, 2:27 PM

Post #20 of 23 (2220 views)
Permalink
Re: memento: time warp for mediawiki [In reply to]

On Nov 13, 2009, at 2:55 PM, Andrew Garrett wrote:
> On 13/11/2009, at 2:25 AM, Aryeh Gregor wrote:
>
>> On Thu, Nov 12, 2009 at 3:55 PM, Herbert Van de Sompel
>> <hvdsomp [at] gmail> wrote:
>>> 2.1. The plug-in detects a client's X-Accept-Datetime header, and
>>> returns the mediawiki page that was active at the datetime specified
>>> in the header. Same for images, actually. This effectively allows
>>> navigating (as in clicking links) a mediawiki collection as it
>>> existed
>>> in the past: as long as a client issues an X-Accept-Datetime header,
>>> matching history pages/images will be retrieved.
>>
>> Doesn't the use of a header here violate the idea of each URL
>> representing only one resource? The server will be returning totally
>> different things for a GET to the same URL. That seems like it would
>> cause all sorts of problems -- not only do caching proxies break
>> (which I'd think by itself makes the feature unusable for users
>> behind
>> caching proxies), but how do you deal with things like bookmarking,
>> or
>> sending a link to a particular version of the page to someone? These
>> would become impossible, unless the server goes to the extra effort
>> to
>> return a redirect.
>
> I assume the solution to this would be a Vary: X-Accept-Datetime
> header.


Please have a look at the HTTP Transactions for datetime content
negotiation available at:

http://www.mementoweb.org/guide/http/local/

This shows that we indeed include a response header:

Vary: negotiate, X-Accept-Datetime

Cheers

Herbert Van de Sompel

==
Herbert Van de Sompel
Digital Library Research & Prototyping
Los Alamos National Laboratory, Research Library
http://public.lanl.gov/herbertv/
tel. +1 505 667 1267







stevagewp at gmail

Nov 15, 2009, 6:11 AM

Post #21 of 23 (2154 views)
Permalink
Re: memento: time warp for mediawiki [In reply to]

On Fri, Nov 13, 2009 at 2:43 AM, Nikola Smolenski <smolensk [at] eunet> wrote:
>> * the timestamp isn't a unique identifier, multiple revisions *might* have the
>> same timestamp. We need a tiebreak (rev_id would be the obvious choice).
>
> I'd say it is, if sufficiently precise :) If not, either use the
> lowest/highest rev_id, or the user could be asked to choose a version.

Seems like a non-issue. User requests the page as it was on the 18th
of December 2006, at 16:45:12 UTC. Which of the two (or more) versions of
the page stored within that second is returned is academic, isn't it?
If they know there are two versions and want to refer to a specific
one, they should use a rev_id, not a time.

Steve



hariharshankar at gmail

Nov 16, 2009, 1:03 PM

Post #22 of 23 (2080 views)
Permalink
Re: memento: time warp for mediawiki [In reply to]

The extension and the wiki page have been updated. The extension now
resolves the issue of multiple revisions having the same timestamp by
returning an 'HTTP/1.1 300 Multiple Choices' with the list of revision URIs
that have the same timestamp.
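
For readers who want to picture the behaviour, here is a rough Python sketch
of choosing between a 302 and a 300 response. It is not taken from the
extension: the function name `respond_for_matches`, the plain-text body with
one URI per line, and the `title`/`oldid` URL shape are assumptions for
illustration, and the extension's actual response format may differ.

```python
from urllib.parse import urlencode


def respond_for_matches(title, rev_ids):
    """Build (status, headers, body) for the revisions matching one timestamp.

    One match yields a 302 redirect to that revision; several matches yield
    a 300 Multiple Choices response listing every candidate URI.
    """
    if not rev_ids:
        raise ValueError("at least one matching revision is required")
    uris = ["/w/index.php?" + urlencode({"title": title, "oldid": rev_id})
            for rev_id in sorted(rev_ids)]
    if len(uris) == 1:
        return 302, {"Location": uris[0]}, ""
    # Assumed body format: one candidate URI per line, oldest rev_id first.
    body = "\n".join(uris) + "\n"
    return 300, {"Content-Type": "text/plain; charset=utf-8"}, body


if __name__ == "__main__":
    status, headers, body = respond_for_matches("Clock", [123457, 123456])
    print(status)
    print(body, end="")
```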
-Harihar



agarrett at wikimedia

Nov 16, 2009, 1:28 PM

Post #23 of 23 (2079 views)
Permalink
Re: memento: time warp for mediawiki [In reply to]

On 12/11/2009, at 1:13 PM, Daniel Kinzler wrote:

> Hi all
>
> The Memento Project <http://www.mementoweb.org/> (including the Los Alamos
> National Laboratory (!) featuring Herbert Van de Sompel of OpenURL fame) is
> proposing a new HTTP header, X-Accept-Datetime, to fetch old versions of a web
> resource. They already wrote a MediaWiki extension for this
> <http://www.mediawiki.org/wiki/Extension:Memento> - which would of course be
> particularly interesting for use on Wikipedia.
>
> Do you think we could have this for Wikimedia projects? I think that would be
> very nice indeed. I recall that ways to look at last week's main page have been
> discussed before, and I see several issues:
>
> * the timestamp isn't a unique identifier, multiple revisions *might* have the
> same timestamp. We need a tiebreak (rev_id would be the obvious choice).
> * templates and images also need to be "time warped". It seems like the
> extension does not address this at the moment. For flagged revisions we do have
> such a mechanism, right? Could that be used here?
> * Squids would need to know about the new header, and bypass the cache when
> it's used.
>
> so, what do you think? what does it take? Can we point them to the missing bits?

This got written up in New Scientist today, for those who are
interested.

http://www.newscientist.com/article/dn18158-timetravelling-browsers-navigate-the-webs-past.html?DCMP=OTC-rss&nsref=online-news

--
Andrew Garrett
agarrett [at] wikimedia
http://werdn.us/


