Just had one of those moments....
<see icon> This seems to be prevalent everywhere today. Politics, coding, manglement, sales, etc. and not limited just to IT.
Git, distributed version control software used by developers to manage source code, includes a command to generate what's known as a pull request, which provides developers with a way to share changes they've made to their copy of a project with the upstream version. For example, were you to want to contribute to Kubernetes, …
If you have a good name for yourself and are respected within the community, it is almost like a guarantee that you wont just make a quick (although well formed) change and then run and not stick around to maintain it.
I think you get this a lot in the ports communities (FreeBSD, OpenBSD). They don't just want a port of some software; they want to know that you will stick around and take up the role of maintainer for the foreseeable future.
There is the "do it badly" approach: if your reputation is sufficient to get away with it, you consciously commit "quick and dirty" in the expectation that it'll draw in others to Do It Right, and start getting more seriously involved in the community from there.
Commentards have already pointed out that quality as measured by the tool used may bear little relation to objective quality or to a project's own guidelines.
The other crucial consideration is where pull requests feature in a project's workflow. If a project decrees that all contributions (including the core team) happen through pull requests - a workflow making the pull request the focus of code review - then you'll get a high acceptance rate based on most of them being 'core'. Perhaps the study should've correlated those acceptance rates with the proportion of all code commits coming through pull requests?
It's not just about sticking around, it is about quality too. And about having to deserve a reputation, a reputation which is about sticking around, but also about quality and usefulness.
There's nothing unexpected about this, actually it is fully to be expected. This is a very good example of the purpose reputation serves. Frankly anyone surprised by this outcome basically isn't qualified to do the research...
Reputation allows us to rely on the work of somebody else without basically having to check them every step of the way. Ideally (and it's never flawless and might even fail big time) someones reputation is well deserved and actually tells you more about the quality/usefulness of his code then a code checker will. For better of worse it is just the way humans interact with each other all the time. How you could think open source projects are somehow an exception is beyond me.
For me, I'd take a pull request from someone I trust to be able to write good and useful code over stuff from a random stranger or somebody known to get it wrong any day, even if the code-quality tool tells me otherwise. Unless code quality tools improve very very very much reputation will be a more reliable measurement of quality.
Alternatively one can of course spend a lot of time actually doing in-depth reviews of pull requests. That would make a difference, but takes huge amounts of time and it often it's just not worth it.
"Perhaps this tool isn't that great. I've seen this sort of thing be crap at actually judging code quality."
I can think of a number of reaons why a pull request should not be accepted...
a) use of global variables for no good reason. I had someone submit a patch that did this to a simple utility I wrote, years before github actually. I basically shelved it. NOT polluting the desgn.
b) use of hard tabs. *NO* - just *NO* (it doesn't display the same on every editor, duh)
the only exception are when required by the file format, like make utilities
c) violating basic code standard. If everything is ALLMAN STYLE, don't you DARE put a K&R style 'if()' block in there.
d) single character variable names - NO
e) anything whitesmith's style - double indenting is an irritation and an eyesore
f) you didn't include comments to explain anything? Even if it's trivial PUT FRICKING COMMENTS. The next confused person might be YOU 5 years from now, going "what was I thinking?"
g) code is just "unreadable" by people like me, who don't like looking at alphabet soup code and having to 'figure things out' instead of getting work done. Do the work up front, use comments and format it in a readable and consistent manner.
h) fears using 'goto' when it's not only practical, it's more efficient to use it. You find a lot of 'goto' in kernel code, worthy of mention.
Now... how does that tool do on THOSE things ???
"other factors such as the reputation of the maintainer and the importance of the feature delivered might be more important than code quality in terms of pull request acceptance"
I didn't see anything which suggested that any of the code being accepted was actually bad. In that case then yes, I would expect that an important feature that users were waiting for would get priority over an unimportant change.
And I would also expect that a change that didn't much value would get rejected, regardless of how well formatted and styled it was.
Indeed, I'm not happy with 'research' that replies on a cheap and ready tool to do the judgements. That's not research, that's arithmetic.
"... describe their analysis of 28 Java open-source projects, which included 4.7m code quality issues in 36,000 pull requests."
So the code was bad because some tool said it was? Did the units tests run? Other tests? Were there further PRs required to fix actual issues from the studies PRs? Did they track their assumptions with a small subset, say 1/200th, to validate their criteria? Shallow research reaches shallow findings.
There is no mention of how many changes did an individual pull request gone through. Assuming the author is good with communication skills, they would be able to accommodate changes (requested by other contributors or perhaps project maintainers) into their pull request and there would be some healthy discussion around it, which obviously increases chances of the PR being applied. I gather there is a strong positive correlation of being able to discuss and amend your change, and being an active member of the community, for reasons which are outside of the programming skill alone.
That's actually a good point. A lot of academic research is a bit shoddy - what they get rewarded for is volume of output, not diligence and polish. Nevertheless, its point is interesting enough to be examined more closely... and presenting an interesting point, even if it may turn out to be wrong, is good enough to get coverage in The Register, too. And so on...
Perhaps the people who measure quality are defining quality wrong.
People may rank something as high quality but that does not mean it is high quality if they don't know what high quality is. Quite clearly git people are among those who define quality wrong.
I wonder if those conducting the survey made a distinction between those projects that only accept pull requests on quarantine branch and those accept pull requests direct to the main branch. The former MO allows for use of procedures and tools to intercept and correct such deficiencies in submitted contributions.
The significant factor here should be that there was a much higher correlation of acceptance with reputation than there was with code quality.
Even if code quality isn’t being measured perfectly, it is a metric that is much better than no metric. You may be able to argue particular data points (but damn few, if PMD’s reputation is even halfway correct), but the trends are inescapable. If you’ve got a better metric, feel free to repeat the study using yours.
5 years ago now, my supplier pulled back from using gcc, switched to a private in-house fork. The stated reason was difficulty getting upstream changes accepted. (The issues were critical for them, but not critical for gcc) The process effectively involved getting your changes accepted by a sponsor, then having the sponsor submit the pull request. With something like a 12 month timeline even if you could get a "high-reputation" person interested.
Doesn't look like much has changed.
Biting the hand that feeds IT © 1998–2020