
Mailing List Archive: Wikipedia: Wikitech

Re: Wikitech-l Digest, Vol 65, Issue 34

 

 



pomahajbo1 at centrum

Dec 28, 2008, 3:44 AM


SOS...SOS...SOS...HELP...Slovakia (Slovensko): thank you for the e-mail, but I don't know what it says, because I don't speak your language. Slovak or Czech, please...

______________________________________________________________
> From: wikitech-l-request [at] lists
> To: wikitech-l [at] lists
> Date: 28.12.2008 04:16
> Subject: Wikitech-l Digest, Vol 65, Issue 34
>
>Send Wikitech-l mailing list submissions to
> wikitech-l [at] lists
>
>To subscribe or unsubscribe via the World Wide Web, visit
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>or, via email, send a message with subject or body 'help' to
> wikitech-l-request [at] lists
>
>You can reach the person managing the list at
> wikitech-l-owner [at] lists
>
>When replying, please edit your Subject line so it is more specific
>than "Re: Contents of Wikitech-l digest..."
>
>
>Today's Topics:
>
>   1. Data center move in Amsterdam: expect some downtime (Mark Bergsma)
>   2. Re: IBM DB2 patch for MediaWiki (Jesús Quiroga)
>   3. Re: Anchors haven't id attribute (Danny B.)
>   4. Re: Anchors haven't id attribute (Brion Vibber)
>   5. Re: IBM DB2 patch for MediaWiki (Aryeh Gregor)
>   6. Re: Anchors haven't id attribute (Aryeh Gregor)
>   7. Re: Anchors haven't id attribute (Danny B.)
>   8. Re: Anchors haven't id attribute (Aryeh Gregor)
>
>
>----------------------------------------------------------------------
>
>Message: 1
>Date: Fri, 26 Dec 2008 22:05:17 +0100
>From: Mark Bergsma <mark [at] wikimedia>
>Subject: [Wikitech-l] Data center move in Amsterdam: expect some
> downtime
>To: Wikimedia developers <wikitech-l [at] lists>, Wikimedia
> Foundation Mailing List <foundation-l [at] lists>
>Message-ID: <4955470D.10503 [at] wikimedia>
>Content-Type: text/plain; charset=ISO-8859-1
>
>In the coming days, until New Year's, we will be moving our servers and
>other equipment in the Amsterdam data center location to a new data
>center. Unfortunately this might result in some downtime and hiccups for
>certain web sites & services, although we will try to keep this to a
>minimum.
>
>On Sunday the 28th, between 09:00 and 11:00 UTC we will migrate our
>network in Amsterdam to new equipment. All services located there will
>be unreachable for a brief period. Traffic for the main wikis will be
>rerouted to the Florida cluster however, and should remain unaffected.
>
>In the days after we will be moving the servers themselves. Some
>services, such as the mailing lists server, the subversion server and
>the toolserver cluster, will be down for a number of hours while the
>equipment is being moved. Traffic for the wikis should again remain
>largely unaffected.
>
>We hope to have the entire migration finished before we enter the last
>few hours of 2008... and start 2009 with a clean sheet. Happy Holidays
>everyone!
>
>--
>Mark Bergsma <mark [at] wikimedia>
>System & Network Administrator, Wikimedia Foundation
>
>
>
>------------------------------
>
>Message: 2
>Date: Sat, 27 Dec 2008 07:23:00 +0100
>From: Jesús Quiroga <jquiroga [at] pobox>
>Subject: Re: [Wikitech-l] IBM DB2 patch for MediaWiki
>To: Wikimedia developers <wikitech-l [at] lists>
>Message-ID: <4955C9C4.9080509 [at] pobox>
>Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
>
>Hello.
>
>After a few days of pondering the issues, I would like to explain what I
>suggested in my previous message, in more detail and (hopefully) more
>clearly.
>
>What I'm about to say is pretty abstract, so it's difficult to convey
>the right meaning. Please forgive me if I say something you already
>know, or just nonsense :-)
>
>
>Jesús Quiroga wrote:
>> I believe a better solution is to design a domain-specific language, an
>> idea not very different from your first one.
>> This DSL would model the interaction between the application and the DB
>> as it is now, and would be designed to evolve. That's it.
>>  
>
>The problem I discuss is how to best access the data store from an
>application. I believe the right answer is different for each project,
>but it's not difficult to evaluate the alternatives, one by one, in a
>given context. I think it is worthwhile to do that in the context of
>MediaWiki.
>
>I will refer to wiki modules and databases as if they were 'hosts'
>connected to a 'network', to highlight the role of languages in the
>operation of the system at runtime.
>
>
>The first way to access the data store is the 'direct' one:
>
>    [polyglot wiki] <--- mysDataL ---> [mysql]
>    [polyglot wiki] <--- posDataL ---> [postgresql]
>    [polyglot wiki] <--- db2DataL ---> [db2]
>
>Here, the polyglot wiki module talks to every database using the proper
>languages. 'mysDataL' means 'the data language understood by MySQL',
>'posDataL' means 'the data language understood by PostgreSQL', etc.
>
>The polyglot wiki promises to learn several languages and to speak them
>correctly forever, so, if a new database comes along or any of their
>data languages evolves, the polyglot wiki is forced to adapt at a
>potentially great cost. Besides, any change to the database schema can
>trigger lots of updates to the wiki code, and be very costly too.
>
>The advantages of this way are well known: it is fast, needs no up-front
>design, and is easy to understand.
>The drawbacks are apparently few, but devastating: verbose and complex
>code in multiple places in the wiki module, very costly to maintain,
>even more costly to evolve. All changes cost a lot, in time and effort.
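As a rough illustration of the 'direct' way, the Python sketch below has the wiki code build dialect-specific SQL itself. The function name `recent_changes_sql`, the table and column names, and the choice of a row-limit query are hypothetical; the query was picked only because DB2 spells row limits differently from MySQL and PostgreSQL.

```python
def recent_changes_sql(backend: str, limit: int) -> str:
    """Build a 'latest N rows' query for one specific database dialect.

    In the 'direct' way every caller in the wiki needs branches like these,
    so each new backend means touching all of them.
    """
    base = "SELECT rc_title FROM recentchanges ORDER BY rc_timestamp DESC"
    if backend in ("mysql", "postgresql"):
        return f"{base} LIMIT {limit}"
    if backend == "db2":
        return f"{base} FETCH FIRST {limit} ROWS ONLY"
    raise ValueError(f"unsupported backend: {backend}")


print(recent_changes_sql("db2", 5))
```

Every place in the wiki that limits a result set would carry the same branching, which is the maintenance cost described above.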
>
>
>
>The second way to access the data store that is usually considered is
>the 'indirect' one:
>
>    [wiki] <--- wikiDataL ---> [polyglot translator]
>
>    [polyglot translator] <--- mysDataL ---> [mysql]
>    [polyglot translator] <--- posDataL ---> [postgresql]
>    [polyglot translator] <--- db2DataL ---> [db2]
>
>Here, wikiDataL means 'some relational data definition and manipulation
>language suitable for use by the wiki'.
>
>The polyglot translator promises to learn wikiDataL and the other
>dialects and to evolve with them, so it has all the problems the wiki
>had in the direct way, but now the cost is lower because a lot of
>complexity is 'hidden' inside the translator and can't reach the wiki.
>As a result, wiki code is not updated as much, and it's much cleaner and
>less verbose.
>
>The advantages of this way are: wiki module code is simpler, and the cost
>of evolution is reduced.
>The drawbacks are apparently many: it's slower, it needs design, it's
>harder to understand, it adds a new language (wikiDataL), and the
>translator can be very complex. However, the need to reduce the cost of
>change is usually so great that these inconveniences are minor in comparison.
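A minimal Python sketch of the 'indirect' way, under the assumption that wikiDataL can be modelled as a small relational API. The class `Translator`, its `select` method and the table and column names are hypothetical and are not MediaWiki's Database class.

```python
class Translator:
    """Turns a neutral, relational query description into SQL for one backend."""

    def __init__(self, backend: str) -> None:
        if backend not in ("mysql", "postgresql", "db2"):
            raise ValueError(f"unsupported backend: {backend}")
        self.backend = backend

    def select(self, table: str, columns: list[str], order_by: str, limit: int) -> str:
        """All dialect differences are confined to this method."""
        sql = f"SELECT {', '.join(columns)} FROM {table} ORDER BY {order_by}"
        if self.backend == "db2":
            return f"{sql} FETCH FIRST {limit} ROWS ONLY"
        return f"{sql} LIMIT {limit}"


def recent_titles_sql(db: Translator) -> str:
    """Wiki code: still thinks in tables and columns, but never in a dialect."""
    return db.select("recentchanges", ["rc_title"], "rc_timestamp DESC", 5)


print(recent_titles_sql(Translator("postgresql")))
print(recent_titles_sql(Translator("db2")))
```

The wiki-side function is identical for every backend; only the translator changes when a new database is added.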
>
>
>
>Now the interesting bit begins. A third possible way to access the data
>store, the 'interpreted' one:
>
>    [wiki] <--- wikiNeedL ---> [polyglot interpreter]
>
>    [polyglot interpreter] <--- mysDataL ---> [mysql]
>    [polyglot interpreter] <--- posDataL ---> [postgresql]
>    [polyglot interpreter] <--- db2DataL ---> [db2]
>
>Here, wikiNeedL means 'some language adequate for the wiki to express
>its data access needs and nothing else'.
>
>wikiNeedL is the domain-specific language I wrote about in my previous
>message.
>
>The differences between wikiDataL and wikiNeedL are mainly these:
>   - wikiNeedL would contain just enough wiki concepts to express the
>wiki's needs, so it's effectively confined to that domain. wikiDataL
>belongs to the relational data model domain, which is quite different.
>   - in general, wikiNeedL would have different semantics than the
>dialects understood by the databases, so the translation step becomes
>more like interpretation, rather than just syntactic transformations.
>wikiDataL usually has the same semantics as the dialects.
>   - wikiNeedL would contain just enough concepts to satisfy current
>needs, and will be open to extension. wikiDataL aims to be
>general-purpose and to fulfill current and future needs.
>
>The main reason to consider the 'interpreted' way is, of course, that it
>helps reduce even more the cost to achieve change.
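One possible reading of the 'interpreted' way, sketched in Python. It assumes wikiNeedL can be modelled as a set of named, wiki-level requests; the class `NeedInterpreter`, the need name `recent_titles` and the table and column names are invented for illustration.

```python
class NeedInterpreter:
    """Answers wiki-level requests; how it does so is its own business."""

    def __init__(self, backend: str) -> None:
        if backend not in ("mysql", "postgresql", "db2"):
            raise ValueError(f"unsupported backend: {backend}")
        self.backend = backend

    def fulfil(self, need: str, **params: int) -> str:
        """Look up the handler for a named need and return the SQL it chose."""
        handler = getattr(self, f"_need_{need}", None)
        if handler is None:
            raise ValueError(f"unknown need: {need}")
        return handler(**params)

    def _need_recent_titles(self, count: int) -> str:
        # The wiki never sees tables, columns or dialects, only the need name.
        sql = "SELECT rc_title FROM recentchanges ORDER BY rc_timestamp DESC"
        if self.backend == "db2":
            return f"{sql} FETCH FIRST {count} ROWS ONLY"
        return f"{sql} LIMIT {count}"


print(NeedInterpreter("db2").fulfil("recent_titles", count=3))
```

Unlike a relational API, the vocabulary here grows one need at a time: a request with no handler is rejected until someone extends the interpreter.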
>
>
>
>So that's what I was talking about. I will say more about the
>differences between the indirect and the interpreted ways in a future
>message.
>
>
>
>Thanks for your attention.
>
>
>
>
>
>------------------------------
>
>Message: 3
>Date: Sat, 27 Dec 2008 13:05:53 +0100 (CET)
>From: Danny B.<Wikipedia.Danny.B [at] email>
>Subject: Re: [Wikitech-l] Anchors haven't id attribute
>To: Wikimedia developers<wikitech-l [at] lists>
>Message-ID: <18263.21683-30277-135341947-1230379553 [at] email>
>Content-Type: text/plain; charset="iso-8859-2"
>
>> ------------ Original message ------------
>> From: Brion Vibber <brion [at] wikimedia>
>> Subject: Re: [Wikitech-l] Anchors haven't id attribute
>> Date: 26.12.2008 06:30:00
>> ----------------------------------------
>> On 12/25/08 4:32 AM, Danny B. wrote:
>> > I have reverted both revisions in r45021 and r45022 because it caused massive
>> invalidity of pages.
>>
>> Given that we've been outputting these as "id" attributes for the last
>> few years already (as output by Tidy), I have reverted your revert in
>> r45044 pending further discussion.
>>
>> -- brion
>
>Well, the id was added _only_ to those tags where the name was transferable to an id, that is, where it started with an ASCII letter. It was _never_ added to those that did not conform to this rule (the regexp mentioned in my previous post). This is easy to verify, either by running an older revision of MediaWiki or by testing in Tidy directly:
>
>Take this code excerpt (and wrap it with minimal XHTML document stuff) and run it through Tidy:
>
><a name="X"></a><h2> <span class="mw-headline"> X </span></h2>
><a name="1X"></a><h2> <span class="mw-headline"> 1X </span></h2>
><a name=".C3.81X"></a><h2> <span class="mw-headline"> ÁX </span></h2>
><a name="-X"></a><h2> <span class="mw-headline"> -X </span></h2>
>
>The result will be:
>
><a name="X" id="X"></a><h2><span class="mw-headline">X</span></h2>
><a name="1X"></a><h2><span class="mw-headline">1X</span></h2>
><a name=".C3.81X"></a><h2><span class="mw-headline">ÁX</span></h2>
><a name="-X"></a><h2><span class="mw-headline">-X</span></h2>
>
>Now, let me repeat, how the "id" is defined:
>
>1: XHTML is a reformulation of HTML 4 as an XML 1.0 application.
>2: That means it takes every single definition from HTML 4 and keeps it unless it is overridden in XHTML.
>3: The id and name have been defined in HTML 4 as /[A-Za-z][A-Za-z0-9:_.-]*/  [1] [2]
>4: The name has been redefined to NMTOKEN  [2] [3]
>5: The id has never been redefined and thus keeps the definition given in point 3 above.
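The rule in point 3 and the Tidy behaviour described in this message can be checked with a few lines of Python. This is only a sketch: `is_html4_id` and `anchor` are names made up here, and `anchor` imitates the output quoted in this message rather than Tidy's actual implementation.

```python
import re

# The HTML 4 pattern quoted in point 3: an ASCII letter, then letters,
# digits, colons, underscores, periods or hyphens.
HTML4_ID = re.compile(r"[A-Za-z][A-Za-z0-9:_.-]*")


def is_html4_id(value: str) -> bool:
    """True if the whole value matches the HTML 4 ID/NAME pattern."""
    return HTML4_ID.fullmatch(value) is not None


def anchor(name: str) -> str:
    """Emit an <a> tag, copying name to id only when the name is a valid id."""
    id_attr = f' id="{name}"' if is_html4_id(name) else ""
    return f'<a name="{name}"{id_attr}></a>'


for name in ("X", "1X", ".C3.81X", "-X"):
    print(anchor(name))
```

Of the four sample names, only `X` receives an id, which matches the result listed in this message.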
>
>This is how id has been handled in XHTML ever since XHTML came out. I also think that if the handling of something as important as id were incorrect, it would have been fixed in the validator over so many years.
>
>So currently, all wikis in non-Latin scripts are totally invalid according to the W3C validator, and large parts of wikis that use non-ASCII characters are invalid as well. That makes it very hard to find other real validity errors in the code, because worthless positives show up on every other page. :-(
>
>Also, one last thing: I think that the current rendering with controversial ids has brought more negatives (such as making it much harder to find the genuinely invalid parts of the code) than positives. It was working correctly before, so what benefit has it actually brought? On the other hand, it has brought this controversy.
>
>I take the point that I (and the majority of people around the world, the validator, Tidy and so many other tools) _may_ be wrong in interpreting the definition of id. But I think that until the authoritative tools, such as the validator or Tidy, are fixed on this issue, and it can thus be proved that we render the page correctly, we should not render it that way. As I mentioned above, it was working correctly before, so there is no urgency to force the new rendering, since it does not correct any mistake or malfunction.
>
>[1] http://www.w3.org/TR/html401/types.html#type-name
>[2] http://www.w3.org/TR/xhtml1/#C_8
>[3] http://www.w3.org/TR/2000/WD-xml-2e-20000814#NT-Nmtoken
>
>
>Kind regards
>
>
>Danny B.
>
>
>
>------------------------------
>
>Message: 4
>Date: Sat, 27 Dec 2008 12:14:33 -0800
>From: Brion Vibber <brion [at] wikimedia>
>Subject: Re: [Wikitech-l] Anchors haven't id attribute
>To: Wikimedia developers <wikitech-l [at] lists>
>Message-ID: <49568CA9.6090104 [at] wikimedia>
>Content-Type: text/plain; charset=ISO-8859-2; format=flowed
>
>[snip]
>
>Maybe we should just fix the normalization function the way we'd already
>planned to, so that it'll work right the way we'd already planned to?
>
>-- brion
>
>
>
>------------------------------
>
>Message: 5
>Date: Sat, 27 Dec 2008 18:25:10 -0500
>From: "Aryeh Gregor" <Simetrical+wikilist [at] gmail>
>Subject: Re: [Wikitech-l] IBM DB2 patch for MediaWiki
>To: "Wikimedia developers" <wikitech-l [at] lists>
>Message-ID:
> <7c2a12e20812271525g3055d1ffr855bc071028262b [at] mail>
>Content-Type: text/plain; charset=UTF-8
>
>On Sat, Dec 27, 2008 at 1:23 AM, Jesús Quiroga <jquiroga [at] pobox> wrote:
>> The second way to access the data store that is usually considered is
>> the 'indirect' one:
>>
>>    [wiki] <--- wikiDataL ---> [polyglot translator]
>>
>>    [polyglot translator] <--- mysDataL ---> [mysql]
>>    [polyglot translator] <--- posDataL ---> [postgresql]
>>    [polyglot translator] <--- db2DataL ---> [db2]
>>
>> Here, wikiDataL means 'some relational data definition and manipulation
>> language suitable for use by the wiki'.
>
>This is what we currently use, and I don't think we're going to
>seriously consider changing it without some very compelling arguments
>being presented.  Incremental improvements to our current way of doing
>things (cutting back on raw queries, moving MySQL-specific stuff from
>Database to DatabaseMySql, defining more clearly what Database methods
>mean and avoiding undefined behavior) seem entirely sufficient to
>allow support for any number of additional database backends.
>
>> The differences between wikiDataL and wikiNeedL are mainly these:
>>   - wikiNeedL would contain just enough wiki concepts to express the
>> wiki's needs, so it's effectively confined to that domain. wikiDataL
>> belongs to the relational data model domain, which is quite different.
>>   - in general, wikiNeedL would have different semantics than the
>> dialects understood by the databases, so the translation step becomes
>> more like interpretation, rather than just syntactic transformations.
>> wikiDataL usually has the same semantics as the dialects.
>>   - wikiNeedL would contain just enough concepts to satisfy current
>> needs, and will be open to extension. wikiDataL aims to be
>> general-purpose and to fulfill current and future needs.
>
>In practice, wikiNeedL would be drastically more complicated, if I
>understand you correctly.  Its basic semantic units would be things
>like articles, users, revisions, etc., instead of rows, columns, and
>tables.  We *have* a wikiNeedL, in fact: it's called "calling the
>appropriate Article method" or whatever.  Most code doesn't have to
>manually do queries.  Further abstraction of the database queries
>would be possible, but I question its usefulness.
>
>------------------------------
>
>Message: 6
>Date: Sat, 27 Dec 2008 19:06:24 -0500
>From: "Aryeh Gregor" <Simetrical+wikilist [at] gmail>
>Subject: Re: [Wikitech-l] Anchors haven't id attribute
>To: "Wikimedia developers" <wikitech-l [at] lists>
>Message-ID:
> <7c2a12e20812271606u6b188edj22a6579803ccd43d [at] mail>
>Content-Type: text/plain; charset=UTF-8
>
>On Sat, Dec 27, 2008 at 3:14 PM, Brion Vibber <brion [at] wikimedia> wrote:
>> [snip]
>>
>> Maybe we should just fix the normalization function the way we'd already
>> planned to, so that it'll work right the way we'd already planned to?
>
>Done in r45109.  I notice, by the way, that HTML5 allows any string
>not containing whitespace for id's . . . yet another case where it
>clearly wins the "don't gratuitously cause pain to developers"
>contest.
>
>
>
>------------------------------
>
>Message: 7
>Date: Sun, 28 Dec 2008 03:02:26 +0100 (CET)
>From: Danny B.<Wikipedia.Danny.B [at] email>
>Subject: Re: [Wikitech-l] Anchors haven't id attribute
>To: Wikimedia developers<wikitech-l [at] lists>
>Message-ID: <18278.21698-2886-1817746719-1230429746 [at] email>
>Content-Type: text/plain; charset="iso-8859-2"
>
>> ------------ Original message ------------
>> From: Aryeh Gregor <Simetrical+wikilist [at] gmail>
>> Subject: Re: [Wikitech-l] Anchors haven't id attribute
>> Date: 28.12.2008 01:07:08
>> ----------------------------------------
>> On Sat, Dec 27, 2008 at 3:14 PM, Brion Vibber <brion [at] wikimedia> wrote:
>> > [snip]
>> >
>> > Maybe we should just fix the normalization function the way we'd already
>> > planned to, so that it'll work right the way we'd already planned to?
>>
>> Done in r45109.  I notice, by the way, that HTML5 allows any string
>> not containing whitespace for id's . . . yet another case where it
>> clearly wins the "don't gratuitously cause pain to developers"
>> contest.
>
>*sigh*
>
>Why do we have to hunt for some other solution when we have a fully working, fully valid and fully intuitive one?
>
>OK, let's summarize the three versions we have:
>
>Terms used:
>- old version - the version used for many years, until r44896
>- mid version - r44896 way
>- new version - r45109 way
>
>The old version was used for many years. It was fully valid: ids were present only where they could be copied from the name AND complied with the regexp mentioned in previous posts. This was done automatically by Tidy. It was also fully intuitive: you just wrote [[#Foo]] and it linked to the section named Foo, or you added #Foo to the URL in the address bar and got to the proper section as well. And it worked properly.
>
>The mid version brought the "feature" that all name attributes were duplicated as ids. That caused massive invalidity of pages, especially on non-Latin and non-ASCII wikis. However, creating anchors remained intuitive.
>
>The new version prepends x to all anchors to solve the problem introduced in the mid version, the massive invalidity of pages. So it solved one problem (which would not have needed solving had we kept the old version) but brought at least two other major ones:
>The first major problem is that this change breaks millions of existing links to sections: links used on wiki pages, links on external sites, links in people's bookmarks, in emails, forum threads etc. Well, OK, let's discount all the external stuff, since we have no influence over it, but that still leaves millions of links on our own wikis which have stopped working as of r45109.
>The other major problem is that from this point on anchor links are no longer intuitive: we are now pushing people to constantly think about prepending x when creating anchor links. No more simple copy-pasting of the headline.
>As a side effect, we are now adding unnecessary work for people on non-Latin wikis by making them switch to a Latin keyboard, or click on edittools or whatever, just to get the one "x" character into the edit box to create the anchor link.
>
>So let me summarize in points:
>* First we did not have any problem at all.
>* Second we had one problem.
>* Third we "solved" the problem but created at least two new ones.
>I am pretty scared of what's coming next... :-/
>
>One final question: what is the benefit of either the mid or the new version over the old one? What new functionality or feature does it bring, or which existing bug does it fix?
>
>
>Kind regards
>
>
>Danny B.
>
>
>
>------------------------------
>
>Message: 8
>Date: Sat, 27 Dec 2008 22:15:24 -0500
>From: "Aryeh Gregor" <Simetrical+wikilist [at] gmail>
>Subject: Re: [Wikitech-l] Anchors haven't id attribute
>To: "Wikimedia developers" <wikitech-l [at] lists>
>Message-ID:
> <7c2a12e20812271915gf2bb722gd33f461fb180b946 [at] mail>
>Content-Type: text/plain; charset=UTF-8
>
>2008/12/27 Danny B. <Wikipedia.Danny.B [at] email>:
>> *sigh*
>>
>> Why do we have to hunt for some other solution when we have a fully working, fully valid and fully intuitive one?
>
>Because:
>
>1) Our previous behavior arguably violated the XHTML 1 specification
>by allowing name attributes to begin with nonletters.  Please don't
>ignore this argument because you think it's wrong.  I think you're
>wrong on this issue too, but I don't just ignore your opinion when
>discussing what the software that we *both* develop should do.  Note
>"arguably" in the first sentence here -- your opinion counts as much
>as mine.
>
>2) It's not arguable at all that the XHTML 1 specification strongly
>recommends that <a> elements with a name attribute also have an id
>attribute.  In fact, section 4.10 states: "In order to ensure that
>XHTML 1.0 documents are well-structured XML documents, XHTML 1.0
>documents MUST use the id attribute when defining fragment identifiers
>on the elements listed above [including <a>]."
>
>I'm not saying these reasons outweigh the reasons against, but those
>are the reasons it was done.  In particular, I don't think I've seen
>an argument from you against (2).
>
>> The old version was used for many years. It was fully valid
>
>Could you *please* stop pretending that a debate doesn't even exist
>here?  It's obnoxious and uncivil, and you keep on doing it.
>
>> The first major problem is that this change breaks millions of existing links to sections: links used on wiki pages, links on external sites, links in people's bookmarks, in emails, forum threads etc. Well, OK, let's discount all the external stuff, since we have no influence over it, but that still leaves millions of links on our own wikis which have stopped working as of r45109.
>
>First of all, all auto-generated internal links (in TOCs) will
>automatically switch to the new format.  Second of all, it should be
>one extra line of code to fix up all manually-created internal links
>as well, so that the x is automatically added as part of the encoding
>process.  (I didn't find where this needed to be done at a quick
>glance.)  So we're only talking about external links here.
>
>This is a one-time cost and I don't think it's a big problem -- at
>worst, a few users will end up on the wrong part of the page.  It
>should be pointed out that this will affect *all* section links on
>non-Latin wikis (since they get encoded to begin with dots and then
>need to start with a letter), but again, only as a one-time cost, and
>only external links (links from external sites or links using external
>link syntax), and it will still get viewers to almost the right place.
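The encoding step mentioned here can be sketched in Python. This is an illustration under stated assumptions, not the code from r45109, which the thread does not show: the function names are hypothetical, the dot-encoding is inferred from the `.C3.81X` example earlier in the digest, and the sketch adds the x only when the encoded name does not already start with an ASCII letter (the thread leaves open whether every anchor gets the prefix).

```python
from urllib.parse import quote


def encode_anchor(heading: str) -> str:
    """Percent-encode a heading as UTF-8, then write the escapes with dots."""
    underscored = heading.strip().replace(" ", "_")
    return quote(underscored, safe=":").replace("%", ".")


def section_id(heading: str) -> str:
    """Return an id that starts with an ASCII letter, prefixing x if needed."""
    encoded = encode_anchor(heading)
    first = encoded[:1]
    if first.isascii() and first.isalpha():
        return encoded
    return "x" + encoded


for heading in ("X", "1X", "ÁX", "-X", "Foo bar"):
    print(heading, "->", section_id(heading))
```

Because the prefix is computed from the heading, the same function could be applied when resolving a manually written internal link such as [[#ÁX]], so editors would not have to type the x themselves.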
>
>> The other major problem is that from this point on anchor links are no longer intuitive: we are now pushing people to constantly think about prepending x when creating anchor links. No more simple copy-pasting of the headline.
>> As a side effect, we are now adding unnecessary work for people on non-Latin wikis by making them switch to a Latin keyboard, or click on edittools or whatever, just to get the one "x" character into the edit box to create the anchor link.
>
>Again, not an issue if internal links are fixed to work correctly.  I
>didn't think about that aspect, but it should be very simple to fix
>(I'd do it now except I'm going to bed).
>
>It seems to me that there are only weak reasons in favor (following
>recommended best practice with no practical effect) and only weak
>reasons against (small one-time transition cost -- unless you're
>correct that there will be longer-term costs, in which case please
>clarify why you think this).  Normally I would say that standards
>compliance by itself (as opposed to standards compliance that brings
>concrete benefit) is worth small one-time costs, although not large
>enough one-time costs and probably not even fairly small recurring
>costs.  So as it stands, without further arguments, I'd still be
>weakly in favor of keeping the current state of trunk, of course with
>the fix for anchors on internal links.
>
>
>
>------------------------------
>
>_______________________________________________
>Wikitech-l mailing list
>Wikitech-l [at] lists
>https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>
>
>End of Wikitech-l Digest, Vol 65, Issue 34
>******************************************
>
