
Mailing List Archive: Wikipedia: Foundation

largest free content website

 

 



jayvdb at gmail

Jul 7, 2011, 11:26 PM

Post #1 of 18 (945 views)
largest free content website

Is Wikipedia the largest "free content" website? i.e. a website
consisting primarily of free content.

http://freedomdefined.org/

The only competitors that I can think of are

1. Project Gutenberg, however they have a few free-gratis etexts
sprinkled through their collection.

2. Million Books Project http://www.ulib.org/

--
John Vandenberg

_______________________________________________
foundation-l mailing list
foundation-l [at] lists
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l


nemowiki at gmail

Jul 8, 2011, 12:00 AM

Post #2 of 18 (925 views)
Re: largest free content website

John Vandenberg, 08/07/2011 08:26:
> 1. Project Gutenberg, however they have a few free-gratis etexts
> sprinkled through their collection.
>
> 2. Million Books Project http://www.ulib.org/

What about the Internet Archive? They certainly have much more free
content than us, if you just count bytes.

Nemo



jayvdb at gmail

Jul 8, 2011, 12:31 AM

Post #3 of 18 (924 views)
Re: largest free content website

On Fri, Jul 8, 2011 at 5:00 PM, Federico Leva (Nemo) <nemowiki [at] gmail> wrote:
>>
> What about the Internet Archive? They certainly have much more free
> content than us, if you just count bytes.

The Internet Archive has many subprojects which are non-free content
(Wayback Machine) and dubious content (Open Source Books).

Their Live Music Archive and Moving Image collection may be bigger in
terms of bytes.

I'm less confident in the Moving Image collection, as they don't
explain why the items are PD, e.g.

http://en.wikipedia.org/wiki/Wikipedia:Media_copyright_questions/Archive/2011/June#Lycanthropus

--
John Vandenberg



grinapo at gmail

Jul 8, 2011, 12:50 AM

Post #4 of 18 (923 views)
Re: largest free content website

On Fri, Jul 8, 2011 at 08:26, John Vandenberg <jayvdb [at] gmail> wrote:
> 2. Million Books Project http://www.ulib.org/

LOTS of copyrighted and dubious content, by random checking.

g



node.ue at gmail

Jul 8, 2011, 1:20 AM

Post #5 of 18 (922 views)
Re: largest free content website

Yes, and I'm sure Wikipedia also has lots of copyrighted and dubious
content, as hard as we try...


2011/7/8 Peter Gervai <grinapo [at] gmail>

> On Fri, Jul 8, 2011 at 08:26, John Vandenberg <jayvdb [at] gmail> wrote:
> > 2. Million Books Project http://www.ulib.org/
>
> LOTS of copyrighted and dubious content, by random checking.
>
> g


nemowiki at gmail

Jul 8, 2011, 1:23 AM

Post #6 of 18 (920 views)
Re: largest free content website

John Vandenberg, 08/07/2011 09:31:
> The Internet Archive has many subprojects which are non-free content
> (wayback machine) and dubious content (Open source books).

They have 1.5 million books (texts) in the 1800-1922 range alone.
<http://www.archive.org/search.php?query=mediatype%3A%28texts%29%20AND%20date%3A[1800-01-01%20TO%201922-12-31]>

Nemo



courcelleswiki at gmail

Jul 8, 2011, 2:30 AM

Post #7 of 18 (922 views)
Re: largest free content website

From having looked through the Internet Archive's live music collection, I doubt
much of it, if any, would be considered free by our definition. They only
seem to care that the artist was fine with being recorded at the show, and
there's certainly no release to do anything you want with the
recording, like there would have to be with a CC-BY-SA release.

The music there is free as in beer, but you couldn't, say, use it commercially or
sell albums of it without falling afoul of copyright law.

On Fri, Jul 8, 2011 at 3:31 AM, John Vandenberg <jayvdb [at] gmail> wrote:

> On Fri, Jul 8, 2011 at 5:00 PM, Federico Leva (Nemo) <nemowiki [at] gmail>
> wrote:
> >>
> > What about the Internet Archive? They certainly have much more free
> > content than us, if you just count bytes.
>
> The Internet Archive has many subprojects which are non-free content
> (wayback machine) and dubious content (Open source books).
>
> Their Live Music Archive and Moving image collection may be bigger in
> terms of bytes.
>
> I'm less confident in the Moving image collection, as they don't
> explain why the items are PD. e.g.
>
>
> http://en.wikipedia.org/wiki/Wikipedia:Media_copyright_questions/Archive/2011/June#Lycanthropus
>
> --
> John Vandenberg


dgerard at gmail

Jul 8, 2011, 2:38 AM

Post #8 of 18 (922 views)
Re: largest free content website

On 8 July 2011 09:20, M. Williamson <node.ue [at] gmail> wrote:

> Yes, and I'm sure Wikipedia also has lots of copyrighted and dubious
> content, as hard as we try...


We're reaching the stage of arguing category membership. This suggests
stepping back:

John, what do you anticipate as the useful purpose for the answer to
the original question of "largest free content site"?

(There may be no non-fuzzy answer.)


- d.



node.ue at gmail

Jul 8, 2011, 3:16 AM

Post #9 of 18 (922 views)
Re: largest free content website

Well, I just think a repository that lets some non-free works slip through
the cracks by accident can't suddenly be disqualified unless we're ready to
disqualify Wikipedia too. So what category do we fit into that they do not?

2011/7/8 David Gerard <dgerard [at] gmail>

> On 8 July 2011 09:20, M. Williamson <node.ue [at] gmail> wrote:
>
> > Yes, and I'm sure Wikipedia also has lots of copyrighted and dubious
> > content, as hard as we try...
>
>
> We're reaching the stage of arguing category membership. This suggests
> stepping back:
>
> John, what do you anticipate as the useful purpose for the answer to
> the original question of "largest free content site"?
>
> (There may be no non-fuzzy answer.)
>
>
> - d.


jayvdb at gmail

Jul 8, 2011, 3:38 AM

Post #10 of 18 (921 views)
Re: largest free content website

On Fri, Jul 8, 2011 at 7:38 PM, David Gerard <dgerard [at] gmail> wrote:
> On 8 July 2011 09:20, M. Williamson <node.ue [at] gmail> wrote:
>
>> Yes, and I'm sure Wikipedia also has lots of copyrighted and dubious
>> content, as hard as we try...
>
>
> We're reaching the stage of arguing category membership. This suggests
> stepping back:
>
> John, what do you anticipate as the useful purpose for the answer to
> the original question of "largest free content site"?
>
> (There may be no non-fuzzy answer.)

By "free content", I mean [[free content]], which can include
fair-use/fair-dealing material, and possibly a small proportion of
non-free content if it is well described.

Wikipedia is the most popular free content website.

http://www.alexa.com/topsites

I'm wondering if it is also safe to say that Wikipedia is the largest
free content website.

--
John Vandenberg



jayvdb at gmail

Jul 8, 2011, 4:04 AM

Post #11 of 18 (923 views)
Re: largest free content website

On Fri, Jul 8, 2011 at 8:16 PM, M. Williamson <node.ue [at] gmail> wrote:
> Well, I just think any repository that lets some non-free works slip through
> the cracks by accident, can't suddenly be disqualified unless we're ready to
> disqualify Wikipedia too. So what category do we fit into that they do not?

I wouldn't disqualify any project which is effectively employing best
efforts to remove non-free content.

My limited browsing of the Open Source Books project indicates it is
mostly junk and has a very high percentage of non-free content, and
they don't appear to be staying on top of it.

The Million Book Project is a bit better, but they often don't include
sufficient metadata, and I've seen many works with a year of
publication that is post-1950 yet "pre-1923" is used as the public
domain justification.

Here are two that I noted as copyrighted back in May.
http://www.archive.org/details/DeskWorkEnglishGrammer
http://www.archive.org/details/PearsCyclopaedia

Here is one of a series that I noted as copyrighted back in 2008.
http://www.archive.org/details/americasmusic030111mbp

--
John Vandenberg



grinapo at gmail

Jul 8, 2011, 5:00 AM

Post #12 of 18 (921 views)
Re: largest free content website

On Fri, Jul 8, 2011 at 13:04, John Vandenberg <jayvdb [at] gmail> wrote:
> The Million Book Project is a bit better, but they often don't include
> sufficient metadata and I've seen many works with a year of
> publication that is post 1950 yet "pre-1923" is used as the public
> domain justification.

Or they actually list that it's copyrighted and show only 15% of the content.

g



geniice at gmail

Jul 8, 2011, 8:19 AM

Post #13 of 18 (918 views)
Re: largest free content website

On 8 July 2011 07:26, John Vandenberg <jayvdb [at] gmail> wrote:
> Is Wikipedia the largest "free content" website?  i.e. website
> consisting primarily of free content.
>
> http://freedomdefined.org/
>
> The only competitors that I can think of are
>
> 1. Project Gutenberg, however they have a few free-gratis etexts
> sprinkled through their collection.
>
> 2. Million Books Project http://www.ulib.org/

In terms of raw data the answer is no. Wikimedia Commons is larger.

Other than that you are probably mostly looking at US government stuff.
The patent office, for example.



--
geni



meta.sj at gmail

Jul 8, 2011, 8:47 AM

Post #14 of 18 (909 views)
Re: largest free content website

Right. NARA has 5 billion pages of PD content online, as I learned
this morning. Is it 'a website'?

The Internet Archive and many others include roughly 2 million PD
books, just under 1 billion pages of text.

Flickr Commons has many times as many free-content photos as WM Commons.

What sets WM projects apart is the amount of curation and collection
management that has gone into them, making the vast majority of the work
both well and consistently categorized, revised and improved where
possible to remove duplicates and mistakes, and sifted to filter up
material that is both useful for general education and can be
cross-connected or linked with other such material [via both internal links
and citations].

In that sense, Wikimedia is the largest online project I know of.

But if we wanted to make Wikisource a repository for all
free-content-licensed material available anywhere online, it would become
100x to 1000x the size of all other projects combined.

SJ

On Fri, Jul 8, 2011 at 3:19 PM, geni <geniice [at] gmail> wrote:
> On 8 July 2011 07:26, John Vandenberg <jayvdb [at] gmail> wrote:
>> Is Wikipedia the largest "free content" website?  i.e. website
>> consisting primarily of free content.
>>
>> http://freedomdefined.org/
>>
>> The only competitors that I can think of are
>>
>> 1. Project Gutenberg, however they have a few free-gratis etexts
>> sprinkled through their collection.
>>
>> 2. Million Books Project http://www.ulib.org/
>
> In terms of raw data the answer is no. Wikimedia Commons is larger.
>
> Other than that you are probably mostly looking at US government stuff.
> The patent office for example.
>
>
>
> --
> geni



--
Samuel Klein          identi.ca:sj           w:user:sj          +1 617 529 4266



geniice at gmail

Jul 8, 2011, 9:38 AM

Post #15 of 18 (910 views)
Re: largest free content website

On 8 July 2011 16:47, Samuel Klein <meta.sj [at] gmail> wrote:
> Right.  NARA has 5 billion pages of PD content online, as I learned
> this morning.  Is it 'a website'?
>

Do you have a cite for that? Could probably be added to:

http://en.wikipedia.org/wiki/Wikipedia:Size_comparisons


--
geni



meta.sj at gmail

Jul 8, 2011, 1:40 PM

Post #16 of 18 (910 views)
Re: largest free content website

Dominic - this is from the Archivist's speech today. Is there a handy cite?

S

On Fri, Jul 8, 2011 at 4:38 PM, geni <geniice [at] gmail> wrote:
> On 8 July 2011 16:47, Samuel Klein <meta.sj [at] gmail> wrote:
>> Right.  NARA has 5 billion pages of PD content online, as I learned
>> this morning.  Is it 'a website'?
>>
>
> Do you have a cite for that? Could probably be added to:
>
> http://en.wikipedia.org/wiki/Wikipedia:Size_comparisons
>
>
> --
> geni



--
Samuel Klein          identi.ca:sj           w:user:sj          +1 617 529 4266



dmcdevit at cox

Jul 8, 2011, 7:16 PM

Post #17 of 18 (900 views)
Re: largest free content website

On 7/8/11 4:40 PM, Samuel Klein wrote:
> Dominic - this is from the Archivist's speech today. Is there a handy cite?
>
> S
>
> On Fri, Jul 8, 2011 at 4:38 PM, geni<geniice [at] gmail> wrote:
>> On 8 July 2011 16:47, Samuel Klein<meta.sj [at] gmail> wrote:
>>> Right. NARA has 5 billion pages of PD content online, as I learned
>>> this morning. Is it 'a website'?
>>>
>> Do you have a cite for that? Could probably be added to:
>>
>> http://en.wikipedia.org/wiki/Wikipedia:Size_comparisons

Actually, I think the point that David Ferriero was making, to give you
a sense of the immensity of their digitization struggle, was that that
is the size of their *holdings*. Their digital collections are not even
in the millions yet; the current official number is 153,000
(documents, so the page count could still be much higher) digitized and
described at the item-level in the catalog, though there may be some
thousands more not in the catalog in online exhibits. They do, of
course, have an increasing number of born-digital documents as well.
It's a huge undertaking. As I mentioned earlier today, only 68% of the
holdings of National Archives are even cataloged, and many of these are
not even item-level descriptions, so they are not even at the point yet
where they know everything they have. Some statistics:
<http://www.archives.gov/research/arc/about-arc.html>.

Dominic



meta.sj at gmail

Jul 8, 2011, 9:45 PM

Post #18 of 18 (902 views)
Re: largest free content website

On Fri, Jul 8, 2011 at 10:15 PM, Dominic McDevitt-Parks
<mcdevitd [at] gmail> wrote:

>>> On 8 July 2011 16:47, Samuel Klein<meta.sj [at] gmail>  wrote:
>>>>
>>>> Right.  NARA has 5 billion pages of PD content online, as I learned
>>>> this morning.  Is it 'a website'?
>>>>
>>> Do you have a cite for that? Could probably be added to:
>>>
>>> http://en.wikipedia.org/wiki/Wikipedia:Size_comparisons
>
> Actually, I think the point that David Ferriero was making, to give you a
> sense of the immensity of their digitization struggle, was that that is the
> size of their *holdings*. Their digital collections are not even in the
> millions yet; the current official number is 153,000 (documents, so the page
> count could still be much higher) digitized and described at the item-level
> in the catalog, though there may be some thousands more not in the catalog
> in online exhibits. They do, of course, have an increasing number of
> born-digital documents as well. It's a huge undertaking. As I mentioned
> earlier today, only 68% of the holdings of National Archives are even
> cataloged, and many of these are not even item-level descriptions, so they
> are not even at the point yet where they know everything they have. Some
> statistics: <http://www.archives.gov/research/arc/about-arc.html>.

Aha, that's a handy stats page. So we aren't totally dwarfed by
other online collections; maybe by a single order of magnitude by the
free-content book collections out there.

SJ

