
Mailing List Archive: Wikipedia: Wikitech
Auto-harvesting of dates
 



lars at aronsson

Jul 16, 2002, 1:58 PM



It is very easy to write a regexp that will recognize and parse dates in
running text using formats like "July 14, 1789". That kind of automatic
harvesting can be applied when a Wiki article is saved, and the dates
found can be indexed and made easily searchable.
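
As a rough illustration of the kind of regexp meant here, the following
Python sketch pulls "Month D, YYYY" dates out of running text. The names
MONTHS, DATE_RE and harvest_dates are made up for this example, and the
pattern is only an assumption about what such a harvester might look
like; it does not validate day numbers or handle other date formats.

```python
import re

# English month names; list position + 1 gives the month number.
MONTHS = ["January", "February", "March", "April", "May", "June",
          "July", "August", "September", "October", "November", "December"]

# Matches e.g. "July 14, 1789": month name, day, comma, year.
DATE_RE = re.compile(r"\b(" + "|".join(MONTHS) + r") (\d{1,2}), (\d{1,4})\b")

def harvest_dates(text):
    """Return a (year, month, day) tuple for each date found in text."""
    found = []
    for m in DATE_RE.finditer(text):
        month = MONTHS.index(m.group(1)) + 1
        found.append((int(m.group(3)), month, int(m.group(2))))
    return found
```

Run on save, the returned tuples could be written to a date index keyed
by page title.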

Was this alternative ever considered before the introduction of the
current Wikipedia custom of writing [[July 14]], [[1789]]? Many people
(I am one of them) have spent their time adding [[]] markup to years and
dates in Wikipedia articles and maintaining the pages for dates and
years. This time could have been saved if automatic harvesting and
indexing had been used instead.

The current Wikipedia custom is an entrenched position that would take
more energy to get out of than I dare think about. However, it is just as
easy to write that regexp to recognize dates in formats like "[[July 14]],
[[1789]]" instead.
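
A sketch of the same idea applied to the linked form, again in Python
and again with hypothetical names (MONTH, LINKED_DATE_RE,
harvest_linked_dates). It assumes the day link and the year link are
separated by an optional comma and a single space, as in the example
above.

```python
import re

MONTH = (r"(January|February|March|April|May|June|July|August|"
         r"September|October|November|December)")

# Matches "[[July 14]], [[1789]]" (the comma between the links is optional).
LINKED_DATE_RE = re.compile(
    r"\[\[" + MONTH + r" (\d{1,2})\]\],? \[\[(\d{1,4})\]\]"
)

def harvest_linked_dates(wikitext):
    """Return a (year, month name, day) tuple for each linked date."""
    return [(int(year), name, int(day))
            for name, day, year in LINKED_DATE_RE.findall(wikitext)]
```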

If you would like to test a function like this, visit
http://susning.nu/Carl_Wilhelm_Scheele
and click on this Swedish chemist's birth date "9 december 1742".

My regexps recognize "99 monthname 9999", "monthname 9999", "year 9999",
"born 9999", "died 9999", "9999-talet" (centuries and decades), which are
the most common ways to specify dates in Swedish text. Clicking on a date
leads to a search for adjacent dates found in other pages, sorted
chronologically. Month names without a day number sort before the 1st of
that month. Years without a month sort before January 1 of that year.
Decades and centuries sort before the first year of the specified interval.
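
The sorting rules above can be sketched with a tuple sort key. This is
not the code running on susning.nu; the Swedish keywords ("år", "född",
"död"), the names PATTERN, sort_key and harvest, and the use of -1 and 0
as placeholder values are all assumptions made for this example.

```python
import re

MONTHS = ["januari", "februari", "mars", "april", "maj", "juni",
          "juli", "augusti", "september", "oktober", "november", "december"]

PATTERN = re.compile(
    r"\b(?:"
    # "9 december 1742" or "december 1742"
    r"(?:(?P<day>\d{1,2}) )?(?P<month>" + "|".join(MONTHS) + r") (?P<year>\d{4})\b"
    # "1700-talet" (century or decade)
    r"|(?P<era>\d{4})-talet"
    # "år 1742", "född 1742", "död 1742"
    r"|(?:år|född|död) (?P<only>\d{4})\b"
    r")"
)

def sort_key(m):
    """Map a match to (year, month, day) so that vaguer dates sort first."""
    if m.group("era"):
        # Decades and centuries sort before the first year of the interval.
        return (int(m.group("era")), -1, 0)
    if m.group("only"):
        # A bare year sorts before January 1 of that year.
        return (int(m.group("only")), 0, 0)
    month = MONTHS.index(m.group("month")) + 1
    # A month without a day number sorts before the 1st of that month.
    day = int(m.group("day") or 0)
    return (int(m.group("year")), month, day)

def harvest(text):
    """Return (sort key, matched text) pairs in chronological order."""
    return sorted((sort_key(m), m.group(0)) for m in PATTERN.finditer(text))
```

Because Python compares tuples element by element, sorting on these keys
reproduces the ordering described above without any special cases.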


--
Lars Aronsson (lars [at] aronsson)
tel +46-70-7891609
http://aronsson.se/ http://elektrosmog.nu/ http://susning.nu/

