
Mailing List Archive: Wikipedia: Foundation

Re: Info/Law blog: Using Wikisource as an Alternative Open Access Repository for Legal Scholarship

 

 



saintonge at telus

Jun 20, 2009, 2:35 PM

Post #1 of 17 (897 views)
Re: Info/Law blog: Using Wikisource as an Alternative Open Access Repository for Legal Scholarship

Brian wrote:
> That is against the law. It violates Google's ToS.
>
> I'm mostly complaining that Google is being Very Evil. There is nothing we
> can do about it except complain to them. Which I don't know how to do - they
> apparently believe that the plain text versions of their books are akin to
> their intellectual property and are unwilling to give them away.
>
>
How is violating Google's ToS against the law? Sites put all sorts of
meaningless garbage into these documents, and users mostly ignore them.

Of course Google's evil; it's about time that people noticed that. They
use their deep pockets as a way to bully other sites ... with a smile.
Fortunately the U.S. does not have database protection laws like the
E.U. Ideally, every PD item they host should also be hosted on an
alternative site, but that's a massive undertaking, ... and they know
it. Nothing requires them to be nice to the competition, such as by
making it easy to copy their material.

Ec

_______________________________________________
foundation-l mailing list
foundation-l[at]lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l


geo.plrd at yahoo

Jun 20, 2009, 3:32 PM

Post #2 of 17 (869 views)
Re: Info/Law blog: Using Wikisource as an Alternative Open Access Repository for Legal Scholarship [In reply to]

If a bot has a meaningful effect on server load (i.e. page requests), it falls under the category of malicious software, which is highly illegal.






saintonge at telus

Jun 20, 2009, 4:56 PM

Post #3 of 17 (869 views)
Re: Info/Law blog: Using Wikisource as an Alternative Open Access Repository for Legal Scholarship [In reply to]

Anthony wrote:
> Wow, what's Wikipedia's policy about using a bot to scrape everything?
>

I don't know about any policy, but I think it should still be
discouraged. For me this has less to do with predation on other sites
than with our inability to keep up with the volume of data that would be
produced. Proofreading and wikifying are labour-intensive processes.
It is very easy for the technically minded to bring the scan and OCR of
a 500-page book under our roof, but without the manpower to bring the
added value, these processes are scarcely better than data dumps.

Ec
> On Sat, Jun 20, 2009 at 2:47 PM, Brian <Brian.Mingus[at]colorado.edu> wrote:
>
>> That is against the law. It violates Google's ToS.
>>
>> I'm mostly complaining that Google is being Very Evil. There is nothing we
>> can do about it except complain to them. Which I don't know how to do - they
>> apparently believe that the plain text versions of their books are akin to
>> their intellectual property and are unwilling to give them away.
>>
>> On Sat, Jun 20, 2009 at 12:34 PM, Falcorian wrote:
>>
>>> So the bot just has to run at human speeds so it does not get banned, it
>>> still won't get tired or make unpredictable mistakes. And you can run it
>>> from different IPs to parallelize.
>>>
>>> --Falcorian




saintonge at telus

Jun 20, 2009, 5:07 PM

Post #4 of 17 (870 views)
Re: Info/Law blog: Using Wikisource as an Alternative Open Access Repository for Legal Scholarship [In reply to]

Geoffrey Plourde wrote:
> If a bot has a meaningful effect on server load (i.e. page requests), it falls under the category of malicious software, which is highly illegal.
>
Malicious software or overloading servers goes well beyond ignoring a
ToS. Why should downloading whole books from Google have any greater
effect on server load than downloading a whole book of similar length
from Internet Archive?

Ec




wikimail at inbox

Jun 20, 2009, 5:13 PM

Post #5 of 17 (871 views)
Re: Info/Law blog: Using Wikisource as an Alternative Open Access Repository for Legal Scholarship [In reply to]

Evil I tell you. Evil!



geo.plrd at yahoo

Jun 20, 2009, 9:53 PM

Post #6 of 17 (869 views)
Re: Info/Law blog: Using Wikisource as an Alternative Open Access Repository for Legal Scholarship [In reply to]

A bot or bots calling up massive amounts of data at high speed can have a negative effect on a server. While I doubt the bot we use would have the power to take down a Google server, the speed of the requests and the constant number of requests will definitely be noticeable, possibly leading to unpleasant consequences.






saintonge at telus

Jun 20, 2009, 10:40 PM

Post #7 of 17 (864 views)
Re: Info/Law blog: Using Wikisource as an Alternative Open Access Repository for Legal Scholarship [In reply to]

Geoffrey Plourde wrote:
> A bot or bots calling up massive amounts of data at high speed can have a negative effect on a server. While I doubt the bot we use would have the power to take down a Google server, the speed of the requests and the constant number of requests will definitely be noticeable, possibly leading to unpleasant consequences.
>
And data accumulation at such a high speed would also be more than could
be properly handled at the Wikisource end. We regularly get
whole works from Internet Archive and other sources, without any such
problems arising. I would not reasonably expect a greater accumulation
rate from Google.

Ec



saintonge at telus

Jun 20, 2009, 10:51 PM

Post #8 of 17 (865 views)
Re: Info/Law blog: Using Wikisource as an Alternative Open Access Repository for Legal Scholarship [In reply to]

Stephen Bain wrote:
> On Sun, Jun 21, 2009 at 5:27 AM, Parker Higgins<parkerhiggins[at]gmail.com> wrote:
>
>> Except google isn't asserting any kind of copyright control over these
>> books, they're just not making it convenient to download them in your
>> preferred format. Maybe not The Right Thing, but not as boneheaded as suing
>> a party who reprints public domain material, as was the case in Feist v.
>> Rural (the supreme court case you mention.)
>>
> They want people to use their service. Fair enough, given that the
> scanning and OCRing happened on their dime.
>
>
How does that give them any special rights? There are no database
protection laws in the US, and sweat-of-the-brow has been rejected as a
basis for new copyrights.

Ec




saintonge at telus

Jun 20, 2009, 11:17 PM

Post #9 of 17 (864 views)
Re: Info/Law blog: Using Wikisource as an Alternative Open Access Repository for Legal Scholarship [In reply to]

Samuel Klein wrote:
> There is a wealth of work done all the time by primary source
> researchers and publishers, which could be improved on by having
> wikisource entries, translations, &c.
>
> Related question : how appropriate would large numbers of public
> domain texts, with page scans and the best available OCR [and
> translations of same], fit with what Wikisource does now? This is
> clearly a wiki project that needs to happen : OCR even at its best
> misses rare meaning-bearing words. If not Wikisource, where should
> this work take place?
>
From my perspective it fits perfectly with the vision that I had of
Wikisource on the first day of its existence. Tim Armstrong
[[User:Tarmstro99]] has already done a considerable amount of valuable
work relating to law on Wikisource. That has been mostly a one-man
project to deal with a massive amount of material. Some have even
proposed deleting all the US Code material on the grounds that we don't
have the ability to keep it up to date. That has prompted some very
interesting questions and ideas about how this kind of stuff might be
handled, but taking those questions to the next level requires lots of
work. Most regular Wikisourcerors already have long personal to-do
lists to keep them busy. So the question is not really about whether
Wikisource should host these goods, it's about recruiting volunteers to
do the hard work.

Ec

> On Sat, Jun 20, 2009 at 11:41 AM, David Gerard<dgerard[at]gmail.com> wrote:
>
>> http://blogs.law.harvard.edu/infolaw/2009/06/19/using-wikisource-as-an-alternative-open-access-repository-for-legal-scholarship/
>>
>> Interesting. How well does this fit with what Wikisource does?
>>
>>
>> - d.
>>
>>




jayvdb at gmail

Jun 21, 2009, 1:17 AM

Post #10 of 17 (851 views)
Re: Info/Law blog: Using Wikisource as an Alternative Open Access Repository for Legal Scholarship [In reply to]

On Sun, Jun 21, 2009 at 1:41 AM, David Gerard <dgerard[at]gmail.com> wrote:
>
> http://blogs.law.harvard.edu/infolaw/2009/06/19/using-wikisource-as-an-alternative-open-access-repository-for-legal-scholarship/
>
> Interesting. How well does this fit with what Wikisource does?

Tim Armstrong is a sysop on Wikisource ... :-) more below..

On Sun, Jun 21, 2009 at 4:17 PM, Ray Saintonge <saintonge[at]telus.net> wrote:
>
> Samuel Klein wrote:
> > There is a wealth of work done all the time by primary source
> > researchers and publishers, which could be improved on by having
> > wikisource entries, translations, &c.
> >
> > Related question : how appropriate would large numbers of public
> > domain texts, with page scans and the best available OCR [and
> > translations of same], fit with what Wikisource does now?  This is
> > clearly a wiki project that needs to happen : OCR even at its best
> > misses rare meaning-bearing words.   If not Wikisource, where should
> > this work take place?

If it was published, Wikisource accepts it. Notability is not a consideration.

The only other "open" project of comparable size is [[Distributed
Proofreaders]]. Here are our statistics:

http://wikisource.org/wiki/Wikisource:ProofreadPage_Statistics

Most of the Wikisource projects accept free translations.

http://wikisource.org/wiki/WS:COORD

The two English Wikisource featured translations are:

http://en.wikisource.org/wiki/Balade_to_Rosemounde
http://en.wikisource.org/wiki/J%27accuse
(also translated into Dutch)

The two biggest translation projects that I know of are:

http://en.wikisource.org/wiki/Romance_of_the_Three_Kingdoms
http://en.wikisource.org/wiki/Bible_(Wikisource)

Another good one is

http://en.wikisource.org/wiki/Max_Havelaar_(Wikisource)

We also have translations of laws, usually relating to copyright.

http://en.wikisource.org/wiki/Ordinance_93-027_of_30_March_1993_on_copyright,_related_rights_and_expressions_of_folklore

>  From my perspective it fits perfectly with the vision that I had of
> Wikisource on the first day of its existence.  Tim Armstrong
> [[User:Tarmstro99]] has already done a considerable amount of valuable
> work relating to law on Wikisource.

Tim has been doing high impact work in this area.

H.R. Rep. No. 94-1476

http://blogs.law.harvard.edu/infolaw/2008/06/17/an-open-access-success-story-just-in-time-for-cali/

U.S. Statutes at Large

http://blogs.law.harvard.edu/infolaw/2008/06/02/public-records-one-jpeg-at-a-time/

http://en.wikisource.org/wiki/United_States_Statutes_at_Large

As regards the USC, the majority of it is a mess, but Title 17 is a
great example of where we are heading.

http://en.wikisource.org/wiki/United_States_Code/Title_17

We also have transcription projects for the UK 1911 copyright act,
which has influenced so many other countries.

http://en.wikisource.org/wiki/Index:The_copyright_act,_1911,_annotated.djvu
http://en.wikisource.org/wiki/Index:A_treatise_upon_the_law_of_copyright.djvu

More can be found from our freshly minted Law index:

http://en.wikisource.org/wiki/Wikisource:Law

Our two featured texts are:
http://en.wikisource.org/wiki/South_Africa_Act_1909
http://en.wikisource.org/wiki/ACLU_v._NSA_(District_Court_opinion)

> Most regular Wikisourcerors already have long personal to-do
> lists to keep them busy. So the question is not really about whether
> Wikisource should host these goods, it's about recruiting volunteers to
> do the hard work.

If people want to help but don't know where to start, my
recommendation is that they start proofreading Stat. volume 1, as
this is a goldmine of interesting documents and will be an excellent
example of crowdsourced transcription.

http://en.wikisource.org/wiki/Index:United_States_Statutes_at_Large/Volume_1

Enjoy,
John Vandenberg



wikimail at inbox

Jun 21, 2009, 4:17 AM

Post #11 of 17 (850 views)
Re: Info/Law blog: Using Wikisource as an Alternative Open Access Repository for Legal Scholarship [In reply to]

On Sun, Jun 21, 2009 at 1:51 AM, Ray Saintonge <saintonge[at]telus.net> wrote:

> Stephen Bain wrote:
> > On Sun, Jun 21, 2009 at 5:27 AM, Parker Higgins<parkerhiggins[at]gmail.com> wrote:
> >
> >> Except google isn't asserting any kind of copyright control over these
> >> books, they're just not making it convenient to download them in your
> >> preferred format. Maybe not The Right Thing, but not as boneheaded as suing
> >> a party who reprints public domain material, as was the case in Feist v.
> >> Rural (the supreme court case you mention.)
> >>
> > They want people to use their service. Fair enough, given that the
> > scanning and OCRing happened on their dime.
> >
> >
> How does that give them any special rights? There are no database
> protection laws in the US, and sweat-of-the-brow has been rejected as a
> basis for new copyrights.


You're right, it doesn't give them any *special* rights. They have the same
rights as any other computer owner. Specifically, they have the right to
choose who uses their computers, and how they use them. Whether or not a
terms of service is legally binding is really not the issue. (*) The issue
is whether or not they have a duty to make it *convenient* for you to
download the data. Of course they don't. Why should they be required to
help you put them out of business? That kind of twisted logic might make
sense in the non-profit world (although I still haven't seen the WMF step up
to the plate and make it easy for people to make a full history fork, or
even to download all the images), but Google is not a non-profit
organization. Google would be Evil if it *didn't* protect itself against
this, as it'd be breaking a promise to its shareholders.

(*) Personally, I'm of the opinion that merely accessing a website is not
sufficient to bind a websurfer to a TOS, and that at most a TOS which you do
not have to even click "agree" to is a unilateral contract which can only
impose promises upon the offeror, though this is not a legal opinion but
merely my opinion of what the law should be.


saintonge at telus

Jun 21, 2009, 9:33 PM

Post #12 of 17 (832 views)
Re: Info/Law blog: Using Wikisource as an Alternative Open Access Repository for Legal Scholarship [In reply to]

Anthony wrote:
> On Sun, Jun 21, 2009 at 10:55 AM, Anthony wrote:
>
>> Okay, http://www.archive.org/details/catholicencyclo16herbgoog happened to
>> be the first book I randomly picked from Google Book Search. There's no
>> text version.
>>
>> And the text version I find of other editions seems to be much much worse
>> than the google OCR results.
>>
> http://books.google.com/books?id=TZ0UAAAAYAAJ strike two, not even there.
> http://books.google.com/books?id=PYAaAAAAYAAJ strike three
> http://www.archive.org/details/happinessessays00hiltgoog finally...let's
> compare the OCR:
>
> "Great numbers of thoughtful people are just now much perplexed to know what
> to make of the faffs of life, and are looking about them for some reasonable
> interpretation of the modern world. They cannot abandon the work of the
> world, but they are conscious that they have not learned the art of work."
>
> "Greaf numbers of thoughtful people are just now much perplexed to know what
> to make of thefaSls of life^ and are looking about them for some reasonable
> interpretation of the modem world. They cannot abandon the work of the
> worlds but they are conscious that they have not learned the art of work."
> ---
> "Few people, however, really know how to work, and even in an age when
> oftener perhaps than ever before we hear of "work" and "workers" one cannot
> observe that the art of work makes much positive progress. On the contrary,
> the general inclination seems to be to work as little as possible, or to
> work for a short time in order to pass the remainder of one's life in rest."
>
> "Few people, however, really know how to work, and even in an age when
> oftener perhaps than ever before we hear of" work " and " workers " one
> cannotobserve that the art of work makes much positive progress. On the
> contrary, the general inclination seems to be to work as little as possible,
> or to work for a short time in order to pass the remainder of one's life in
> rest. "
> ---
> I guess that's acceptable. The Catholic encyclopedia results were much
> worse, though. Maybe it was a font thing, but I'm not quite interested
> enough to bother doing a more in depth study right now.
>
Who is expecting OCR to be perfect anywhere? In the absence of real
human proofreading I assume any OCR material to be fraught with errors.
Wikisource aims to accurately reproduce what was published, including
original errors. Scans alone provide the needed accuracy, but they are
not suitable for the added value of wikification.

Ec



carnildo at gmail

Jun 22, 2009, 4:49 PM

Post #13 of 17 (806 views)
Re: Info/Law blog: Using Wikisource as an Alternative Open Access Repository for Legal Scholarship [In reply to]

On Sat, Jun 20, 2009 at 14:35, Ray Saintonge<saintonge[at]telus.net> wrote:
> Brian wrote:
>> That is against the law. It violates Google's ToS.
>>
>> I'm mostly complaining that Google is being Very Evil. There is nothing we
>> can do about it except complain to them. Which I don't know how to do - they
>> apparently believe that the plain text versions of their books are akin to
>> their intellectual property and are unwilling to give them away.
>>
>>
> How is violating Google's ToS against the law?

The verdict in _United States v. Lori Drew_ appears to set a precedent
that violating a site's Terms of Service is a violation of the
Computer Fraud and Abuse Act. It's not a very strong precedent, but
it's still there.

--
Mark
[[en:User:Carnildo]]



wikipedia at verizon

Jun 23, 2009, 10:44 AM

Post #14 of 17 (802 views)
Re: Info/Law blog: Using Wikisource as an Alternative Open Access Repository for Legal Scholarship [In reply to]

Brian wrote:
> 2009/6/23 Samuel Klein <meta.sj[at]gmail.com>
>
>> Yes, but my understanding is that while google provided part of the mbp data
>> and scans, its continued updates to ocr since then are not being shared. I
>> would be glad to learn this was not the case...
>>
> The dataset you need to train an OCR system to be as good as theirs is the
> raw images and the plain text. They aren't making it easy to get either of
> those things :( They have presumably improved the software in other ways as
> well..
>
> WTF GOOG?
>
Well, when your shorthand uses their stock ticker symbol, your argument
has already been coopted.

--Michael Snow



removed at example

Jun 23, 2009, 10:51 AM

Post #15 of 17 (800 views)
Re: Info/Law blog: Using Wikisource as an Alternative Open Access Repository for Legal Scholarship [In reply to]

On Tue, Jun 23, 2009 at 11:44 AM, Michael Snow <wikipedia[at]verizon.net> wrote:

> > The dataset you need to train an OCR system to be as good as theirs is the
> > raw images and the plain text. They aren't making it easy to get either of
> > those things :( They have presumably improved the software in other ways as
> > well..
> >
> > WTF GOOG?
> >
> Well, when your shorthand uses their stock ticker symbol, your argument
> has already been coopted.
>
> --Michael Snow
>

I get the joke but um, I used it on purpose and which one of my arguments
has been "coopted" ??


wikipedia at verizon

Jun 23, 2009, 11:13 AM

Post #16 of 17 (801 views)
Re: Info/Law blog: Using Wikisource as an Alternative Open Access Repository for Legal Scholarship [In reply to]

Brian wrote:
> On Tue, Jun 23, 2009 at 11:44 AM, Michael Snow <wikipedia[at]verizon.net> wrote:
>
>>> The dataset you need to train an OCR system to be as good as theirs is the
>>> raw images and the plain text. They aren't making it easy to get either of
>>> those things :( They have presumably improved the software in other ways as
>>> well..
>>>
>>> WTF GOOG?
>>>
>> Well, when your shorthand uses their stock ticker symbol, your argument
>> has already been coopted.
>>
>> --Michael Snow
>>
> I get the joke but um, I used it on purpose and which one of my arguments
> has been "coopted" ??
>
Coopting is not like rebutting; it does not bite chunks out of specific
pieces, it swallows whole. Symbols are powerful things, perhaps even
more so outside the mathematical logic of argument. They do not serve
only your purposes, even if you use them purposefully. My observations
may be wry, but they are not entirely in jest.

--Michael Snow



removed at example

Jun 23, 2009, 11:24 AM

Post #17 of 17 (802 views)
Re: Info/Law blog: Using Wikisource as an Alternative Open Access Repository for Legal Scholarship [In reply to]

Ok Shakespeare. But in plain English you appear to be saying that
corporations are inherently greedy and have a tendency to be evil. Sure, but
we expect more out of GOOG. This is not MSFT we are talking about.

