Mailing List Archive: Linux: Kernel

463 kernel developers missing!

 

 



jonsmirl at gmail

Jul 28, 2008, 1:38 PM

Post #26 of 98
Re: 463 kernel developers missing!

On 7/28/08, Theodore Tso <tytso [at] mit> wrote:
> On Mon, Jul 28, 2008 at 03:00:13PM -0400, Jon Smirl wrote:
> > Other people aren't perfect, I've found over 1,000 typos in those
> > names and emails. We need a validation mechanism.
> >
>
>
> You keep using the word "need"; I do not think it means what you think
> it does. :-)
>
> Seriously, why is it so important? It's a nice to have, and I
> recognize that you've spent a bunch of time on it. But if the goal is
> to get better statistics, and in exchange we forcibly map all Mark
> Browns to one e-mail address, and/or force them to all adopt middle
> initials (what if there are two Dan Smith's that don't have middle
> initials) just for the convenience of your statistics gathering, I
> would gently suggest to you that you've forgotten which is the tail,
> and which is the dog.

There are over 1,000 typos in the logs. No validation is being done on
the names/addresses in the logs. Many email addresses aren't
syntactically valid. Why not put some checks in place to try and clean
this up? Signed-off-by is worthless if it is full of garbage.
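
A check of the kind asked for here could be quite small. The following
Python sketch is only an illustration: the pattern and the name
`is_plausible_ident` are assumptions made for this example, not anything
in the kernel tree or in git, and the pattern is deliberately much looser
than RFC 5322. It only rejects obviously broken `Name <address>` pairs.

```python
import re

# Hypothetical pattern for "Real Name <local@domain.tld>".
# Deliberately simple: it is not a full RFC 5322 parser.
IDENT_RE = re.compile(
    r"^(?P<name>[^<>]+?)\s+"                  # real name, no angle brackets
    r"<(?P<local>[^\s<>@\\]+)"                # local part, no spaces or backslashes
    r"@(?P<domain>[A-Za-z0-9.-]+\.[A-Za-z]{2,})>$"  # domain with at least one dot
)


def is_plausible_ident(ident: str) -> bool:
    """Return True if ident looks like 'Name <user@host.tld>'."""
    return IDENT_RE.match(ident.strip()) is not None
```

A check like this would flag bare host names, stray backslashes and
trailing junk after the domain, while leaving the question of who the
person actually is to a human.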

There are two Mark Browns in the file:
Mark Brown <broonie [at] opensource>
Mark Brown <broonie [at] sirena>

I don't know if these are two different people or one person with two
emails. But the file doesn't force that decision. It's git shortlog
that is combining them.

The file serves two purposes:
- Map people using multiple email aliases to a single human-readable
  name. It can be any name they choose. The existing file already does
  this, but the list is not complete.
- Enumerate all email addresses used in the log so that it is possible
  to tell when a new address is encountered. This allows simple
  validation to be implemented.

In its current form it doesn't indicate which alias is the
developer's currently active one.
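
The two purposes can be sketched in a few lines of Python. This is a
simplified illustration, not git's actual mailmap handling: it assumes
one `Name <email>` entry per line, whereas the real .mailmap format has
additional forms, and the function names `load_map`, `resolve` and
`unknown_addresses` are invented for this example.

```python
import re

# One "Name <email>" entry per line (a simplification of the real format).
ENTRY_RE = re.compile(r"^(?P<name>[^<>]+?)\s*<(?P<email>[^<>]+)>\s*$")


def load_map(lines):
    """Build {email: canonical name} from 'Name <email>' lines."""
    mapping = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        m = ENTRY_RE.match(line)
        if m:
            mapping[m.group("email")] = m.group("name")
    return mapping


def resolve(mapping, email):
    """Purpose 1: map an alias to a name; None means never seen before."""
    return mapping.get(email)


def unknown_addresses(mapping, seen_emails):
    """Purpose 2: addresses from the log with no entry, in first-seen order."""
    out = []
    for email in seen_emails:
        if email not in mapping and email not in out:
            out.append(email)
    return out
```

Lookups here are exact string matches, so two spellings that differ only
in case count as different addresses unless both are listed.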

--
Jon Smirl
jonsmirl [at] gmail
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


davej at redhat

Jul 28, 2008, 1:46 PM

Post #27 of 98
Re: 463 kernel developers missing!

On Mon, Jul 28, 2008 at 04:22:36PM -0400, Theodore Tso wrote:
> On Mon, Jul 28, 2008 at 03:00:13PM -0400, Jon Smirl wrote:
> > Other people aren't perfect, I've found over 1,000 typos in those
> > names and emails. We need a validation mechanism.
> >
>
> You keep using the word "need"; I do not think it means what you think
> it does. :-)
>
> Seriously, why is it so important? It's a nice to have, and I
> recognize that you've spent a bunch of time on it. But if the goal is
> to get better statistics, and in exchange we forcibly map all Mark
> Browns to one e-mail address, and/or force them to all adopt middle
> initials (what if there are two Dan Smith's that don't have middle
> initials) just for the convenience of your statistics gathering, I
> would gently suggest to you that you've forgotten which is the tail,
> and which is the dog.

I'm beginning to question just how useful the continued measuring
of things like Signed-off-by's is. Last week at OLS, I overheard
a conversation where someone was talking about the "top 10" lists
that Greg has been talking about at various conferences.
The conversation went along the lines of "my manager really wants
to see us on that list, at any cost".
Whilst the naive may think 'more patches == more better', this isn't
necessarily the case given we have nowhere near enough review bandwidth
*now*, and flooding with a zillion trivial patches really isn't going
to make that job any easier.

Getting patches into the tree is easy, we've proven that.
As things stand now, it's also fairly easy to 'game' the system
by committing something in 10 changesets when it could be done
just as easily in 2-3.

How about we start measuring things that actually matter, like..

"How many patches were reviewed before they went in"
"How many patches were directly responsible for a bug"
"How many patches actually fixed something anyone cares about"
"How many patches are responsible for just 'churn'"

Dave

--
http://www.codemonkey.org.uk


randy.dunlap at oracle

Jul 28, 2008, 2:14 PM

Post #28 of 98
Re: 463 kernel developers missing!

On Mon, 28 Jul 2008 16:46:24 -0400 Dave Jones wrote:

> On Mon, Jul 28, 2008 at 04:22:36PM -0400, Theodore Tso wrote:
> > On Mon, Jul 28, 2008 at 03:00:13PM -0400, Jon Smirl wrote:
> > > Other people aren't perfect, I've found over 1,000 typos in those
> > > names and emails. We need a validation mechanism.
> > >
> >
> > You keep using the word "need"; I do not think it means what you think
> > it does. :-)
> >
> > Seriously, why is it so important? It's a nice to have, and I
> > recognize that you've spent a bunch of time on it. But if the goal is
> > to get better statistics, and in exchange we forcibly map all Mark
> > Browns to one e-mail address, and/or force them to all adopt middle
> > initials (what if there are two Dan Smith's that don't have middle
> > initials) just for the convenience of your statistics gathering, I
> > would gently suggest to you that you've forgotten which is the tail,
> > and which is the dog.
>
> I'm beginning to question just how useful the continued measuring
> of things like Signed-off-by's is. Last week at OLS, I overheard
> a conversation where someone was talking about the "top 10" lists
> that Greg has been talking about at various conferences.
> The conversation went along the lines of "my manager really wants
> to see us on that list, at any cost".
> Whilst the naive may think 'more patches == more better', this isn't
> necessarily the case given we have nowhere near enough review bandwidth
> *now*, and flooding with a zillion trivial patches really isn't going
> to make that job any easier.
>
> Getting patches into the tree is easy, we've proven that.
> As things stand now, it's also fairly easy to 'game' the system
> by committing something in 10 changesets when it could be done
> just as easily in 2-3.
>
> How about we start measuring things that actually matter, like..
>
> "How many patches were reviewed before they went in"
> "How many patches were directly responsible for a bug"
> "How many patches actually fixed something anyone cares about"
> "How many patches are responsible for just 'churn'"

It would be Good if we could give more value to Reviewed-by: tag lines also...

IOW, we "need" to do this. :)


---
~Randy
Linux Plumbers Conference, 17-19 September 2008, Portland, Oregon USA
http://linuxplumbersconf.org/


jmorris at namei

Jul 28, 2008, 3:01 PM

Post #29 of 98
Re: 463 kernel developers missing!

On Mon, 28 Jul 2008, Randy Dunlap wrote:

> It would be Good if we could give more value to Reviewed-by: tag lines also...
>
> IOW, we "need" to do this. :)

Also, Tested-by:, to encourage and recognize people who may not be
confident in reviewing code to at least test it, which is immensely
useful if done thoughtfully.

"Measuring programming progress by lines of code is like measuring
aircraft building progress by weight."

If you know who said this, award yourself a cookie :-)


- James
--
James Morris
<jmorris [at] namei>


jonsmirl at gmail

Jul 28, 2008, 3:08 PM

Post #30 of 98
Re: 463 kernel developers missing!

On 7/28/08, Dave Jones <davej [at] redhat> wrote:
> On Mon, Jul 28, 2008 at 04:22:36PM -0400, Theodore Tso wrote:
> > On Mon, Jul 28, 2008 at 03:00:13PM -0400, Jon Smirl wrote:
> > > Other people aren't perfect, I've found over 1,000 typos in those
> > > names and emails. We need a validation mechanism.
> > >
> >
> > You keep using the word "need"; I do not think it means what you think
> > it does. :-)
> >
> > Seriously, why is it so important? It's a nice to have, and I
> > recognize that you've spent a bunch of time on it. But if the goal is
> > to get better statistics, and in exchange we forcibly map all Mark
> > Browns to one e-mail address, and/or force them to all adopt middle
> > initials (what if there are two Dan Smith's that don't have middle
> > initials) just for the convenience of your statistics gathering, I
> > would gently suggest to you that you've forgotten which is the tail,
> > and which is the dog.
>
>
> I'm beginning to question just how useful the continued measuring
> of things like Signed-off-by's is. Last week at OLS, I overheard
> a conversation where someone was talking about the "top 10" lists
> that Greg has been talking about at various conferences.
> The conversation went along the lines of "my manager really wants
> to see us on that list, at any cost".

I didn't do this to measure statistics, I did it because I was writing
a script and the script was getting garbage for input. It just had the
side effect of cleaning up the statistics.

> Whilst the naive may think 'more patches == more better', this isn't
> necessarily the case given we have nowhere near enough review bandwidth
> *now*, and flooding with a zillion trivial patches really isn't going
> to make that job any easier.
>
> Getting patches into the tree is easy, we've proven that.
> As things stand now, it's also fairly easy to 'game' the system
> by committing something in 10 changesets when it could be done
> just as easily in 2-3.
>
> How about we start measuring things that actually matter, like..
>
> "How many patches were reviewed before they went in"
> "How many patches were directly responsible for a bug"
> "How many patches actually fixed something anyone cares about"
> "How many patches are responsible for just 'churn'"
>

These are good topics for the Plumbers conference. But to ask these
questions we need to get the data into a format where a computer can
process it. Syntax checking, validation, etc are needed on the log
messages. I'm not going to hunt through 100,000 commits trying to
answer these by hand.

Another fun experiment would be to load an archive of LKML, kernel
bugzilla and the kernel source history into git and then try to link
everything together. The cleaner the data is, the easier it will be to
link things. How about a GUI where each patch is annotated with a link
to the email thread discussing it?

--
Jon Smirl
jonsmirl [at] gmail


tytso at mit

Jul 28, 2008, 3:32 PM

Post #31 of 98
Re: 463 kernel developers missing!

On Mon, Jul 28, 2008 at 06:08:33PM -0400, Jon Smirl wrote:
> I didn't do this to measure statistics, I did it because I was writing
> a script and the script was getting garbage for input. It just had the
> side effect of cleaning up the statistics.

Out of curiosity, what is your script trying to do?

- Ted


randy.dunlap at oracle

Jul 28, 2008, 3:38 PM

Post #32 of 98
Re: 463 kernel developers missing!

On Mon, 28 Jul 2008 18:32:41 -0400 Theodore Tso wrote:

> On Mon, Jul 28, 2008 at 06:08:33PM -0400, Jon Smirl wrote:
> > I didn't do this to measure statistics, I did it because I was writing
> > a script and the script was getting garbage for input. It just had the
> > side effect of cleaning up the statistics.
>
> Out of curiosity, what is your script trying to do?


Speaking of missing developers, I'd be more interested in whatever
happened to Michal Piotrowski, Satyam Sharma, et al...


---
~Randy
Linux Plumbers Conference, 17-19 September 2008, Portland, Oregon USA
http://linuxplumbersconf.org/


stefanr at s5r6

Jul 28, 2008, 3:38 PM

Post #33 of 98
Re: 463 kernel developers missing!

Jon Smirl wrote:
> Another fun experiment would be to load an archive of LKML, kernel
> bugzilla and the kernel source history into git and then try to link
> everything together.

Another fun experiment: Fetch 10 open bugs in bugzilla which may affect
your hardware, try to reproduce one of them, fix it.
--
Stefan Richter
-=====-==--- -=== ===-=
http://arcgraph.de/sr/


jonsmirl at gmail

Jul 28, 2008, 3:52 PM

Post #34 of 98
Re: 463 kernel developers missing!

On 7/28/08, Theodore Tso <tytso [at] mit> wrote:
> On Mon, Jul 28, 2008 at 06:08:33PM -0400, Jon Smirl wrote:
> > I didn't do this to measure statistics, I did it because I was writing
> > a script and the script was getting garbage for input. It just had the
> > side effect of cleaning up the statistics.
>
>
> Out of curiosity, what is your script trying to do?

I was trying to locate my patches in other private trees that were
ready for deletion. I wanted to make sure there wasn't something good
that I had forgotten about. I processed the output from 'git log' and
got tripped up matching the author field because it was full of junk.
My database background kicked in and I found myself on a tangent
cleaning up the data.

I have since learned about the existence of 'git shortlog' which
solved my problem. But I had already cleaned up the data before
finding it.

>
> - Ted
>


--
Jon Smirl
jonsmirl [at] gmail


lethal at linux-sh

Jul 28, 2008, 4:41 PM

Post #35 of 98
Re: 463 kernel developers missing!

On Tue, Jul 29, 2008 at 08:01:09AM +1000, James Morris wrote:
> On Mon, 28 Jul 2008, Randy Dunlap wrote:
>
> > It would be Good if we could give more value to Reviewed-by: tag lines also...
> >
> > IOW, we "need" to do this. :)
>
> Also, Tested-by:, to encourage and recognize people who may not be
> confident in reviewing code to at least test it, which is immensely
> useful if done thoughtfully.
>
> "Measuring programming progress by lines of code is like measuring
> aircraft building progress by weight."
>
> If you know who said this, award yourself a cookie :-)
>
Or just filter on "-by:", which seems to get anything relevant, including
people that shamelessly make up their own tags. Converting something from
a Cc: to a *-by: requires manual effort at least, which ought to be
sufficient for recognition.
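
Filtering on "-by:" can be sketched in Python as follows. This is a
minimal illustration under assumptions: commit messages are taken as
plain strings (how they are obtained is left out), and the names
`BY_TAG_RE`, `by_tags` and `count_by_tags` are invented for this example.

```python
import re
from collections import Counter

# Matches "Signed-off-by: Name <addr>" and any other "*-by:" line,
# including made-up tags. "Cc:" is not matched.
BY_TAG_RE = re.compile(
    r"^\s*(?P<tag>[A-Za-z][A-Za-z-]*-by):\s*(?P<who>.+?)\s*$",
    re.IGNORECASE,
)


def by_tags(commit_message):
    """Yield (tag, who) for every '*-by:' line in one commit message."""
    for line in commit_message.splitlines():
        m = BY_TAG_RE.match(line)
        if m:
            yield m.group("tag").lower(), m.group("who")


def count_by_tags(messages):
    """Count (who, tag) pairs over an iterable of commit messages."""
    counts = Counter()
    for msg in messages:
        for tag, who in by_tags(msg):
            counts[(who, tag)] += 1
    return counts
```

Note that `who` is used verbatim, so the same person under two spellings
is still counted twice.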

If someone was really bored they could probably make a table of tags with
various points to try and balance things slightly more objectively.
Though it seems we now at least have totally different metrics on LWN,
for the kernel summit selection process, and Jon's new script. ;-)
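
Such a table of tags with points might look like the Python sketch below.
The weights are entirely hypothetical, since no values are proposed in
this thread, and `TAG_POINTS` and `score` are names made up for the
example.

```python
# Hypothetical weights, chosen only to show the idea.
TAG_POINTS = {
    "signed-off-by": 1,
    "acked-by": 1,
    "tested-by": 2,
    "reviewed-by": 3,
}


def score(tag_counts, points=TAG_POINTS, default=0):
    """Sum points over a {tag: count} mapping; unknown tags get `default`."""
    return sum(points.get(tag.lower(), default) * n
               for tag, n in tag_counts.items())
```

Giving unknown tags a default of zero means a made-up tag earns nothing
unless someone adds it to the table.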

Trying to map all of the names seems pretty pointless though, most
regular contributors contribute in a fairly consistent and sane manner,
with the odd mismatch or typo here or there. It might make sense for
anyone where there's a significant difference, but those are going to be
corner cases.


jonsmirl at gmail

Jul 28, 2008, 5:14 PM

Post #36 of 98
Re: 463 kernel developers missing!

On 7/28/08, Paul Mundt <lethal [at] linux-sh> wrote:
> On Tue, Jul 29, 2008 at 08:01:09AM +1000, James Morris wrote:
> > On Mon, 28 Jul 2008, Randy Dunlap wrote:
> >
> > > It would be Good if we could give more value to Reviewed-by: tag lines also...
> > >
> > > IOW, we "need" to do this. :)
> >
> > Also, Tested-by:, to encourage and recognize people who may not be
> > confident in reviewing code to at least test it, which is immensely
> > useful if done thoughtfully.
> >
> > "Measuring programming progress by lines of code is like measuring
> > aircraft building progress by weight."
> >
> > If you know who said this, award yourself a cookie :-)
> >
>
> Or just filter on "-by:", which seems to get anything relevant, including
> people that shamelessly make up their own tags. Converting something from
> a Cc: to a *-by: requires manual effort at least, which ought to be
> sufficient for recognition.
>
> If someone was really bored they could probably make a table of tags with
> various points to try and balance things slightly more objectively.
> Though it seems we now at least have totally different metrics on LWN,
> for the kernel summit selection process, and Jon's new script. ;-)
>
> Trying to map all of the names seems pretty pointless though, most
> regular contributors contribute in a fairly consistent and sane manner,
> with the odd mismatch or typo here or there. It might make sense for
> anyone where there's a significant difference, but those are going to be
> corner cases.

12% of the name/email pairs are messed up. It's not all simple typos.
There is significant mangling of non ASCII charsets by people's tools
in the maintainer's chain of processing. Half of the time I don't
believe what the author is submitting is what is ending up in the log
due to mangling. It's a larger source of noise than typos.

All of these variations on email names are in the log. Humans can
identify these problems, it is much harder for a machine.

For example, where are these backslashes coming from?
Auke-Jan H Kok <auke-jan.h.kok [at] intel>
Auke-Jan H Kok <auke\-jan.h.kok [at] intel>
Auke-Jan H Kok <auke\\-jan.h.kok [at] intel>
Auke-Jan H Kok <auke\\\-jan.h.kok [at] intel>
Auke-Jan H Kok <sofar [at] foo-projects>

Are the tools case sensitive or insensitive on email addresses? Some
are, some aren't, so I need these cases...
Al Viro <viro [at] zeniv>
Al Viro <viro [at] zenIV>
Al Viro <viro [at] ZenIV>

Another problem is internal machine names...
David S. Miller <davem [at] sunset>
David S. Miller <davem [at] davemloft>
David S. Miller <davem [at] huronp11>
David S. Miller <davem [at] hutch>
David S. Miller <davem [at] bnsf>
David S. Miller <davem [at] t1000>
David S. Miller <davem [at] ultra5>
David S. Miller <davem [at] goma>

Or varying the email name...
Alexey Starikovskiy <alexey.y.starikovskiy [at] intel>
Alexey Starikovskiy <alexey_y_starikovskiy [at] linux>
Alexey Starikovskiy <alexey.y.starikovskiy [at] linux>

Why do these all end in (none)?
Craig Hughes <craig [at] com(none)>
Dave Neuer <dneuer [at] org(none)>
David Brownell <david-b [at] net(none)>
David Woodhouse <dwmw2 [at] org(none)>
Deepak Saxena <dsaxena [at] net(none)>
Enrico Scholz <enrico.scholz [at] de(none)>

--
Jon Smirl
jonsmirl [at] gmail


rene.herman at keyaccess

Jul 28, 2008, 5:29 PM

Post #37 of 98
Re: 463 kernel developers missing!

On 29-07-08 02:14, Jon Smirl wrote:

> Why do these all end in (none)?
> Craig Hughes <craig [at] com(none)>
> Dave Neuer <dneuer [at] org(none)>
> David Brownell <david-b [at] net(none)>
> David Woodhouse <dwmw2 [at] org(none)>
> Deepak Saxena <dsaxena [at] net(none)>
> Enrico Scholz <enrico.scholz [at] de(none)>

Because rmk rewrites addresses to comply with privacy laws. Another good
example of why this nonsense of yours is exactly that.

I checked and am personally in there three times, once even without any
valid email address listed. And any time there's anything other than my
gmail address in some submission it at least recently means that someone
_else_ took my from: address and stuck it on there and while I don't
terribly mind that generally, I find it really annoying to see even
those mistakes harvested into your hugely google-accessible resource.

This is just yet another example of the senseless robotic crap people
just insist is "needed" and "valuable", but which is neither.

Nonsense it is.

Rene.


lethal at linux-sh

Jul 28, 2008, 5:33 PM

Post #38 of 98
Re: 463 kernel developers missing!

On Tue, Jul 29, 2008 at 02:29:01AM +0200, Rene Herman wrote:
> On 29-07-08 02:14, Jon Smirl wrote:
>
> >Why do these all end in (none)?
> >Craig Hughes <craig [at] com(none)>
> >Dave Neuer <dneuer [at] org(none)>
> >David Brownell <david-b [at] net(none)>
> >David Woodhouse <dwmw2 [at] org(none)>
> >Deepak Saxena <dsaxena [at] net(none)>
> >Enrico Scholz <enrico.scholz [at] de(none)>
>
> This is just yet another example of the senseless robotic crap people
> just insist is "needed" and "valuable", but which is neither.
>
Speaking of which, lk-changelog did the same sort of thing back in the BK
days, which was at least useful for generating a pretty short log.
Perhaps it makes more sense to start from that if someone really wants to
waste their time on this. I'm still not sure what the point is though.


jonsmirl at gmail

Jul 28, 2008, 5:50 PM

Post #39 of 98
Re: 463 kernel developers missing!

On 7/28/08, Rene Herman <rene.herman [at] keyaccess> wrote:
> On 29-07-08 02:14, Jon Smirl wrote:
>
>
> > Why do these all end in (none)?
> > Craig Hughes <craig [at] com(none)>
> > Dave Neuer <dneuer [at] org(none)>
> > David Brownell <david-b [at] net(none)>
> > David Woodhouse <dwmw2 [at] org(none)>
> > Deepak Saxena <dsaxena [at] net(none)>
> > Enrico Scholz <enrico.scholz [at] de(none)>
> >
>
> Because rmk rewrites addresses to comply with privacy laws. Another good
> example of why this nonsense of yours is exactly that.
>
> I checked and am personally in there three times, once even without any
> valid email address listed. And any time there's anything other than my
> gmail address in some submission it at least recently means that someone
> _else_ took my from: address and stuck it on there and while I don't
> terribly mind that generally, I find it really annoying to see even those
> mistakes harvested into your hugely google-accessible resource.

The emails in the list are extracted from the commit log. I did not
touch the emails. If your email is in there wrong it is in a log
message wrong. That doesn't necessarily mean you are the person who
put it into the log wrong, patches can get mangled when being passed
along the maintainer chain. The point of this file is to turn the
mistake back into something useful. Think of these as reverse
mappings, they convert errors back to usable names.

As for privacy, if you don't want your email address in a file like
this don't put it into a GPL'd public project. Generate a random name
and email for each patch you submit. Of course I'm having trouble with
a Signed-off-by: that can't be turned back into a person.
Signed-off-by is there to track the responsibility chain for a patch
and if the chain has been obfuscated what good is it?

> This is just yet another example of the senseless robotic crap people
> just insist is "needed" and "valuable", but which is neither.
>
> Nonsense it is.
>
> Rene.
>


--
Jon Smirl
jonsmirl [at] gmail


viro at ZenIV

Jul 28, 2008, 6:15 PM

Post #40 of 98
Re: 463 kernel developers missing!

On Mon, Jul 28, 2008 at 03:00:13PM -0400, Jon Smirl wrote:

> Other people aren't perfect, I've found over 1,000 typos in those
> names and emails. We need a validation mechanism.

Who's "we", luser, and why would I possibly give a damn for your needs?


jonsmirl at gmail

Jul 28, 2008, 6:25 PM

Post #41 of 98
Re: 463 kernel developers missing!

On 7/28/08, Al Viro <viro [at] zeniv> wrote:
> On Mon, Jul 28, 2008 at 03:00:13PM -0400, Jon Smirl wrote:
>
>
> > Other people aren't perfect, I've found over 1,000 typos in those
> > names and emails. We need a validation mechanism.
>
> Who's "we", luser, and why would I possibly give a damn for your needs?

Let's drop the whole Signed-off-by mechanism. If we can't be bothered to
clean up the junk in Signed-off-by why should we bother recording
them? Sign every patch Mickey Mouse, it has the same effect.

--
Jon Smirl
jonsmirl [at] gmail


viro at ZenIV

Jul 28, 2008, 6:36 PM

Post #42 of 98
Re: 463 kernel developers missing!

On Mon, Jul 28, 2008 at 09:25:39PM -0400, Jon Smirl wrote:
> On 7/28/08, Al Viro <viro [at] zeniv> wrote:
> > On Mon, Jul 28, 2008 at 03:00:13PM -0400, Jon Smirl wrote:
> >
> >
> > > Other people aren't perfect, I've found over 1,000 typos in those
> > > names and emails. We need a validation mechanism.
> >
> > Who's "we", luser, and why would I possibly give a damn for your needs?
>
> Let's drop the whole Signed-off-by mechanism. If we can't be bothered to
> clean up the junk in Signed-off-by why should we bother recording
> them? Sign every patch Mickey Mouse, it has the same effect.

That still doesn't answer either of my questions. As for your question, the
point is to have them good enough to make an individual changeset feasible
to track.


jonsmirl at gmail

Jul 28, 2008, 7:01 PM

Post #43 of 98
Re: 463 kernel developers missing!

On 7/28/08, Al Viro <viro [at] zeniv> wrote:
> On Mon, Jul 28, 2008 at 09:25:39PM -0400, Jon Smirl wrote:
> > On 7/28/08, Al Viro <viro [at] zeniv> wrote:
> > > On Mon, Jul 28, 2008 at 03:00:13PM -0400, Jon Smirl wrote:
> > >
> > >
> > > > Other people aren't perfect, I've found over 1,000 typos in those
> > > > names and emails. We need a validation mechanism.
> > >
> > > Who's "we", luser, and why would I possibly give a damn for your needs?
> >
> > Let's drop the whole Signed-off-by mechanism. If we can't be bothered to
> > clean up the junk in Signed-off-by why should we bother recording
> > them? Sign every patch Mickey Mouse, it has the same effect.
>
> That still doesn't answer either of my questions. As for your question, the
> point is to have them good enough to make an individual changeset feasible
> to track.

The file lets you convert the mess that exists in the log file's xx-by:
fields back into something reasonable. The messed up email addresses
are extracted verbatim from the log. There is one entry in the file
for each email address that appears in the log. The real names have
been fixed by script and by hand to match a real name to each of the
extracted emails.

Now we will differ on the definition of feasible and whether we should
work to prevent more messed up emails/names from getting into the log.
That's the central question here, how much are you allowed to
obfuscate (on purpose or accidentally) your identity in an xx-by?

I should also point out that external information (Google) was needed
to identify several hundred names, there was insufficient information
in the log or kernel source. If we have to reconstruct this mapping
ten years from now for some random lawsuit, the external information
may not be there.

--
Jon Smirl
jonsmirl [at] gmail


tytso at mit

Jul 28, 2008, 7:50 PM

Post #44 of 98
Re: 463 kernel developers missing!

On Mon, Jul 28, 2008 at 10:01:06PM -0400, Jon Smirl wrote:
> I should also point out that external information (Google) was needed
> to identify several hundred names, there was insufficient information
> in the log or kernel source. If we have to reconstruct this mapping
> ten years from now for some random lawsuit, the external information
> may not be there.

Jon,

The reality is ten years from now, many e-mail addresses won't
be accurate anyway. We will have to track people down by hand, if it
ever comes down to that. The signed-off-by needs to be enough so we
can track down someone (very likely only a small set of people); doing so
via a manual method is quite acceptable. I don't think it is really
necessary to try to force-fit the signed-off-by just so we can collect
better mode.

It should also be noted that the Developer's Certificate of
Origin 1.1 has language that was designed to make it legal to collect
the DCO lines even in the European Union. So what rmk is doing is
strictly speaking not necessary.

- Ted


jonsmirl at gmail

Jul 28, 2008, 8:23 PM

Post #45 of 98 (319 views)
Permalink
Re: 463 kernel developers missing! [In reply to]

On 7/28/08, Theodore Tso <tytso [at] mit> wrote:
> On Mon, Jul 28, 2008 at 10:01:06PM -0400, Jon Smirl wrote:
> > I should also point out that external information (Google) was needed
> > to identify several hundred names; there was insufficient information
> > in the log or kernel source. If we have to reconstruct this mapping
> > ten years from now for some random lawsuit, the external information
> > may not be there.
>
>
> Jon,
>
> The reality is that ten years from now, many e-mail addresses won't
> be accurate anyway. We will have to track people down by hand, if it
> ever comes down to that. The signed-off-by needs to be enough so we
> can track down someone (very likely only a small set of people); a
> manual method is quite acceptable. I don't think it is really
> necessary to try to force-fit the signed-off-by just so we can collect
> better mode.

The kernel already has a mailmap file, but it is not complete. So I
should just take this work that makes the mailmap file a lot better
and throw it away? The policy is that the log file should be messed up
enough so that a computer can't process it and that a human can
recover it only with several days' effort? That's a really hard line
to define and we'll probably lose the identity of a bunch of
contributors. I'll follow up with a patch that deletes the current
.mailmap

--
Jon Smirl
jonsmirl [at] gmail


tytso at MIT

Jul 28, 2008, 9:13 PM

Post #46 of 98 (320 views)
Permalink
Re: 463 kernel developers missing! [In reply to]

On Mon, Jul 28, 2008 at 11:23:31PM -0400, Jon Smirl wrote:
> The kernel already has a mailmap file, but it is not complete. So I
> should just take this work that makes the mailmap file a lot better
> and throw it away? The policy is that the log file should be messed up
> enough so that a computer can't process it and that a human can
> recover it only with several days' effort? That's a really hard line
> to define and we'll probably lose the identity of a bunch of
> contributors. I'll follow up with a patch that deletes the current
> .mailmap

Personally, I have no objection to the mailmap file as it's on the
whole an improvement; if it's been automatically generated and it
falsely maps multiple people to a single person, that would be highly
unfortunate, but maybe it fixes more problems than it creates.

I think the part most people are seriously objecting to is the
supposition that Linus and some of his top lieutenants should be
enforcing some arbitrary rule that rejects commits if they come from
addresses outside of your .mailmap file (unless they first send a
patch to add their e-mail address to the .mailmap file), in some kind
of misguided attempt to enforce validation, the main justification
for which is apparently so that you and others can run some
statistical analysis. And there seems to be some dispute about whether
encouraging people to compete to get into the top 20
signed-off-by by splitting up commits into 100 different micro-patches
should be considered a desirable side effect of said statistical
analysis.

As I said earlier, the moment you started advocating enforcing
validation, you may have started to confuse which is the tail and
which is the dog. People should be supplying patches to improve the
kernel; not to provide accurate fodder for statistical analysis.

- Ted


tytso at MIT

Jul 28, 2008, 9:15 PM

Post #47 of 98 (315 views)
Permalink
Re: 463 kernel developers missing! [In reply to]

On Tue, Jul 29, 2008 at 12:13:37AM -0400, Theodore Tso wrote:
> Personally, I have no objection to the mailmap file as it's on the
> whole an improvement; if it's been automatically generated and it
> falsely maps multiple people to a single person, that would be highly
> unfortunate, but maybe it fixes more problems than it creates.

Typo correction. The first part of that sentence should read:

"Personally, I have no objection to the mailmap file IF on the
whole it's an improvement...."

> I think the part most people are seriously objecting to is the
> supposition that Linus and some of his top lieutenants should be
> enforcing some arbitrary rule that rejects commits if they come from
> addresses outside of your .mailmap file (unless they first send a
> patch to add their e-mail address to the .mailmap file), in some kind
> of misguided attempt to enforce validation, the main justification
> for which is apparently so that you and others can run some
> statistical analysis. And there seems to be some dispute about whether
> encouraging people to compete to get into the top 20
> signed-off-by by splitting up commits into 100 different micro-patches
> should be considered a desirable side effect of said statistical
> analysis.
>
> As I said earlier, the moment you started advocating enforcing
> validation, you may have started to confuse which is the tail and
> which is the dog. People should be supplying patches to improve the
> kernel; not to provide accurate fodder for statistical analysis.
>
> - Ted


jonsmirl at gmail

Jul 28, 2008, 10:05 PM

Post #48 of 98 (311 views)
Permalink
Re: 463 kernel developers missing! [In reply to]

On 7/29/08, Theodore Tso <tytso [at] mit> wrote:
> On Mon, Jul 28, 2008 at 11:23:31PM -0400, Jon Smirl wrote:
> > The kernel already has a mailmap file, but it is not complete. So I
> > should just take this work that makes the mailmap file a lot better
> > and throw it away? The policy is that the log file should be messed up
> > enough so that a computer can't process it and that a human can
> > recover it only with several days' effort? That's a really hard line
> > to define and we'll probably lose the identity of a bunch of
> > contributors. I'll follow up with a patch that deletes the current
> > .mailmap
>
>
> Personally, I have no objection to the mailmap file as it's on the
> whole an improvement; if it's been automatically generated and it
> falsely maps multiple people to a single person, that would be highly
> unfortunate, but maybe it fixes more problems than it creates.

The problem of mapping multiple people to a single person was always
there; the new mailmap file doesn't alter it. There simply isn't
enough information in the kernel source to tell if there are one or
two Mark Browns. The file would need to be extended to encode more
information.

Mark Brown <broonie [at] opensource>
Mark Brown <broonie [at] sirena>

If the Marks want to separate themselves they will need to alter the
mailmap. With the new mailmap this is easily done. With the old one
you would have needed to identify all of the aliases first.

It's the higher-level tools that are combining these into a single person.
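
To illustrate the point about higher-level tools: a minimal sketch, assuming
a tool that keys its report on the author name alone. The function name and
the placeholder addresses are hypothetical and are not taken from git or from
the kernel's .mailmap.

```python
from collections import defaultdict


def group_by_name(identities):
    """Group (name, email) pairs by name only, as a name-keyed report would."""
    groups = defaultdict(set)
    for name, email in identities:
        groups[name].add(email)
    # Sort the addresses so the result is stable and easy to read.
    return {name: sorted(emails) for name, emails in groups.items()}


# Two entries that share a name end up under one key, whether or not
# they belong to the same person. Addresses are illustrative placeholders.
identities = [
    ("Mark Brown", "mark@first.example"),
    ("Mark Brown", "mark@second.example"),
]
merged = group_by_name(identities)
```

The file keeps both lines separate; the merge only happens in whatever groups
by name afterwards.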

> I think the part most people are seriously objecting to is the
> supposition that Linus and some of his top lieutenants should be
> enforcing some arbitrary rule that rejects commits if they come from
> addresses outside of your .mailmap file (unless they first send a
> patch to add their e-mail address to the .mailmap file), in some kind
> of misguided attempt to enforce validation, the main justification
> for which is apparently so that you and others can run some
> statistical analysis. And there seems to be some dispute about whether
> encouraging people to compete to get into the top 20
> signed-off-by by splitting up commits into 100 different micro-patches
> should be considered a desirable side effect of said statistical
> analysis.

That whole thread was pointless; the scripts for doing validation
don't exist. The stat tools are helpful in finding errors in the
mailmap file. I never cared about the stat results; I already know who
the top developers are. Let's drop the whole validation concept too
since it is obviously upsetting people.

There are two types of entries in the file: ones that alter the names
associated with an email and ones that don't. You could argue that the
ones that don't alter the names aren't needed. They're in there to
make maintenance on the file easier.

Putting all emails in the file lets you do maintenance by extracting
the complete list of emails from the log and then removing the ones
already in the file. Now you only have to manually check these new
emails. If the unchanged entries were removed from the file they'd get
mixed in with the new emails. Each time you updated mailmap you'd have
a couple thousand emails to check.
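
The maintenance step described above amounts to a set difference. A minimal
sketch, assuming both inputs are lists of lines in the simple
"Name <email>" form (for the log side, something like the output of
git log --format='%an <%ae>'). The function names are hypothetical, and other
mailmap line forms are not handled here.

```python
import re

# Matches the simple "Name <email>" form; comment lines and blanks do not match.
ENTRY = re.compile(r"^\s*([^<#]*?)\s*<([^>]+)>")


def parse_identities(lines):
    """Return (name, lowercased email) pairs from 'Name <email>' lines."""
    pairs = []
    for line in lines:
        match = ENTRY.match(line)
        if match:
            pairs.append((match.group(1), match.group(2).lower()))
    return pairs


def unreviewed_emails(log_lines, mailmap_lines):
    """Emails seen in the log that the mailmap does not list yet."""
    known = {email for _, email in parse_identities(mailmap_lines)}
    seen = {email for _, email in parse_identities(log_lines)}
    return sorted(seen - known)
```

Only the addresses this returns need a manual look; everything already listed
in the file, changed or unchanged, drops out.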

Putting the unchanged entries in the file also makes it very easy for
people who want to alter their name entry. Just edit the mailmap file.
Everything is there and sorted by name. Change the name for all of
your aliases to whatever you want. Just make sure the names are all
identical on the aliases.
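
The edit described above can be sketched as follows, assuming entries are held
as (name, email) pairs. The function name and the sample addresses are
hypothetical; this is an illustration of the idea, not an existing tool.

```python
def rename_aliases(entries, aliases, new_name):
    """Give every entry whose email is in `aliases` the same name, sorted by name."""
    wanted = {email.lower() for email in aliases}
    renamed = [
        (new_name if email.lower() in wanted else name, email)
        for name, email in entries
    ]
    # Keep the file ordering described in the text: sorted by name, then email.
    return sorted(renamed, key=lambda pair: (pair[0].lower(), pair[1].lower()))


entries = [
    ("J Dev", "j@old.example"),
    ("Jay Developer", "j@new.example"),
    ("Other Person", "o@else.example"),
]
updated = rename_aliases(entries, ["j@old.example", "j@new.example"], "Jay Developer")
```

After the call, both of the listed addresses carry the identical name, which is
the property the paragraph above asks people to preserve.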

> As I said earlier, the moment you started advocating enforcing
> validation, you may have started to confuse which is the tail and
> which is the dog. People should be supplying patches to improve the
> kernel; not to provide accurate fodder for statistical analysis.

These addresses have more purposes than statistical analysis. They
also record the responsibility chain of who submitted the patch. It
seems prudent to me that we should make some effort to attempt to keep
that chain in a reasonably clean state.

I believe that people can get their name/email right in a patch 99% of
the time. The bulk of the 12% error rate appears to be coming from
maintainer tools mangling the patches and exposed internal mail server
names. The real message is that there are some tools that need to be
fixed.

--
Jon Smirl
jonsmirl [at] gmail


w at 1wt

Jul 28, 2008, 10:19 PM

Post #49 of 98 (314 views)
Permalink
Re: 463 kernel developers missing! [In reply to]

On Mon, Jul 28, 2008 at 11:45:29AM -0400, Jon Smirl wrote:
> On 7/28/08, Adrian Bunk <bunk [at] kernel> wrote:
> > You count merges as patches.
>
> I just used the output from git shortlog, is there a better way?

git shortlog --no-merges

Willy



nickpiggin at yahoo

Jul 29, 2008, 2:58 AM

Post #50 of 98 (314 views)
Permalink
Re: 463 kernel developers missing! [In reply to]

On Tuesday 29 July 2008 06:46, Dave Jones wrote:
> On Mon, Jul 28, 2008 at 04:22:36PM -0400, Theodore Tso wrote:
> > On Mon, Jul 28, 2008 at 03:00:13PM -0400, Jon Smirl wrote:
> > > Other people aren't perfect, I've found over 1,000 typos in those
> > > names and emails. We need a validation mechanism.
> >
> > You keep using the word "need"; I do not think it means what you think
> > it does. :-)
> >
> > Seriously, why is it so important? It's a nice to have, and I
> > recognize that you've spent a bunch of time on it. But if the goal is
> > to get better statistics, and in exchange we forcibly map all Mark
> > Browns to one e-mail address, and/or force them to all adopt middle
> > initials (what if there are two Dan Smith's that don't have middle
> > initials) just for the convenience of your statistics gathering, I
> > would gently suggest to you that you've forgotten which is the tail,
> > and which is the dog.
>
> I'm beginning to question just how useful the continued measuring
> of things like Signed-off-by's is. Last week at OLS, I overheard
> a conversation where someone was talking about the "top 10" lists
> that Greg has been talking about at various conferences.
> The conversation went along the lines of "my manager really wants
> to see us on that list, at any cost".
> Whilst the naive may think 'more patches == more better', this isn't
> necessarily the case given we have nowhere near enough review bandwidth
> *now*

This is one way of looking at "the problem". The other way to look at
it is that things are merged too quickly / without enough review, etc.

That is the problem kernel maintainers can actually do something about.
Or, they can just whine about "not enough review bandwidth".

There has been this complaining from lots of people about not enough
review bandwidth for quite a few years now. So I doubt it is going to
magically get better by making more noise.

Consider that there is probably a virtually limitless amount of crap that
people want to try to merge, so there is always going to be a lack of
review bandwidth if the aim is to merge as much as we possibly can as
fast as we can.

The answer is to not make the problem worse by merging stuff faster
than can be reviewed. When that happens, developers and companies
should eventually assign a higher value to patch review.
