[Dev] 27000 errors in the Tizen operating system
c.haitzler at samsung.com
Mon Jul 17 04:17:39 UTC 2017
On Fri, 14 Jul 2017 17:19:36 +0300
Andrey Karpov <karpov at viva64.com> wrote:
> Hello Carsten,
> After publishing my article, several unreasonable news appeared on
> the Internet. That's why, I'd like to note that in my article I
> didn’t write about the bad/good quality of Tizen code or that
> PVS-Studio analyzer is magic, best of the best. I only gave those
> numbers that I’d got. I can't talk about the quality of Tizen code,
> since I have insufficient data for this purpose. I understand clearly
> what you are talking about and agree with you.
Indeed that is one reason I ask for numbers that can be compared
because when you publish some number, and that number is large (in this
case due to the fact that codebase you extrapolated for is large) ... a
lot of people don't see the forest from the trees.
Don't get me wrong. Code quality is important. Fixing bugs is
important. Tools to point out "here is a potential issue" are very
useful. But you do need to stand back and see the possibly urgency in
light of everything else and that requires comparative numbers.
> My objective was to show that despite the already used techniques,
> PVS-Studio analyzer can help to make Tizen code better and more
> reliable. As I think, I managed to demonstrate this by pointing to
> the 900 fragments of code, which, in my opinion, deserve attention,
> fixing and refactoring.
At this point, for me at least, the jury is out on PVS Studio. We have
internal tools and we're using those first at this point on Tizen's
codebase. If PVS Studio is better or not than those tools, I'm not
going to comment, as I am not in a good position to do so.
For the upstream projects tizen depends on that I look after, I'm VERY
familiar with Coverity scan and to a lesser extent clang's static
analyser. I can say the text reports PVS studio produce pale in
comparison to what Coverity provides in the Web UI. It provides code
flow analysis (what code path was used to get there) which is very
useful in learning the "why", A full browsable view of the code (so I
don't keep having to match it back to the source file/line when
reading), and a collaborative environment where many deveopers can work
together live on the issue list and see what is triaged by who, whenand
a log about it, BUT it offers something I have not seen so far in the
information you provided for PVS Studio on the blog references here,
... the ability to say "No. False positive. Tool - you're wrong. Here's
why". Coverity lets you do this... which lets you move on and not
modify code that works fine and is correct.
My experience shows that a decent percentage of "fixes for warnings
from coverity" lead to ADDING bugs. I've seen it happen many times. It
made things WORSE. It went from a "theoretical bug but actually due to
environment will never happen" to "I wake up, update code and suddenly
app x, y, z are broken in some way and it was a fix for a warning".
This is also why I'm wary of having people unfamiliar with code "fix
it" based on such warnings. Even people who know the code well mess up.
> Unfortunately, the question "Include this as a bugs per 1 k lines of
> code or similar metric?" is not very clear.
It's simple. "This code has an average of 0.123 bugs per 1k lines of
code". But measure that on everything you analyse. You at least have a
ROUGH measuring stick. They may be all major exploitable issues, maybe
all minor unlikely ones or maybe some combination so a rating system
ultimately would score them based on that, BUT let's keep it simple. If
you quote a number, quote numbers for everything you look at and to
make it comparable... make it proportional to size.
> In my opinion, the article presents all the necessary data. I've got:
> * The density of detected errors in code (c) 2015 Samsung
> Electronics: 0.41 errors on 1000 lines of code.
Your numbers say it's 0.375. 900 in 2.4 million lines of code.
900/2400 = 0.375. The 0.41 is including false positives? Now we have a
report disagreeing with itself. It's small, but important.
OK - found it now. I missed that paragraph towards the end (it is a
long article), but these numbers disagrees with others you give right at
the top and in the headline (the headline number is an extrapolation).
Why do your numbers differ?
> * The density of detected errors in the third-party libraries: 0.36
> errors on 1000 lines of code.
> (I did not consider comments as the lines of code).
> Can these data be incorrect? Yes, they can. This is not a scientific
> research, this is a demonstration on practice that the tool may be
> Moreover, some errors, in Tizen developers’ opinion, canbe not that
> erroneous. Well, at least, there is no sense to fix them. Then the
> density of detected errors will diminish.
This is another issue entirely - if the error is "worth worrying about".
> On the other hand, I might highlight not all the errors. I approached
> to the study of the report very carefully, but without bigotry. For
> example, I was a bit lazy to study the warnings V730 -
> https://www.viva64.com/en/w/V730/. This is a very time consuming and
> thankless work, when you work with someone else's code. All the time
> it is unclear, if it is dangerous or not, that some class member has
> been left uninitialized. It is a tedious long labour that needs to be
> done carefully. So, perhaps, with more careful reviewing of the log,
> other errors can be found.
Indeed it is time consuming to examine them all. I think you did a good
job at explaining what errors are there and why in the general sense
these are potential issues.
> About the comparison with the quality of other projects... Difficult
> question. I ask to understand, that while writing the articles we
> don’t have the aim to compare which code is better or
> worse.Therefore, we usually stop when found enough of quite
The problem is - when you publish numbers like you did, it becomes a
comparison thing. That's how people respond. Headlines like "27000
bugs found in Tizen!" are very much designed to catch attention with
numbers but when there is no context provided... the only thing people
see is a headline.
> interesting bugs for writing an article. Performing the careful
> analysis of all warnings for a large project will take a lot of time.
> I also ask to consider that it is difficult and long to deal with
> unknown code. Therefore, we sometimes mention about density of errors
> only for small projects, since is not very difficult to view the
> whole report. Examples:
> * Notepad++: we detect about 2 errors per 1000 lines of code.
WOW. that's a lot. Thanks for the number.
> * Far Manager for Linux: we detect about 0.464 errors per 1000 lines
> of code. https://www.viva64.com/en/b/0478/
Thanks. your numbers are in line with what coverity indicates too.
> * Tor project: we do not find anything. Density
Indeed that is good. I also noted openssl last i checked on coverity
scan had a density of 0 too.
> As we can see, the results are different. However, it seems to me
> they are not worth to concentrate on. Static analyzer is a tool of
They are a tool for prioritization. As I said - bugs are bugs and
should be fixed. Knowing if your bug count is particularly bad or now
compared to "the industry at large" etc. lets you know how much effort
or money or time to spend on these issues.
I have taken a position in upstream projects to just slowly fix the
reported issues over a long period of time. every release - fix some
more so every release has the bug rate go down. So you mix feature
addition and static analysis triage, but focus on the feature
development etc. and less so on triage.
> finding bugs in fresh code. Yes, old mistakes are also worth to be
> fixed, but generally they are not as critical as new ones. Actually,
> if the error is in the code for several years,it means that it rarely
> reveals itself or interferes no one. That’s why it is interesting to
> look to the future rather than the past. Sure, the PVS-Studio
> analyzer can be a good assistant for a programmer.
Indeed I agree there.
> About the percent of false positives. It makes no sense to talk about
> them without first configuring the analyzer. It's a lot of work,
> which we are ready to engage, if a cooperation begins someday. Can we
> deal with it? Yes, we can:
That sounds like a fair bit of work.
> I think when it comes to projects of such a large size as Tizen, it
> makes sense to speak not only on product licensing, but also on a
> great support, carried out by our team.
> P.S. On Monday I will demonstrate that the analyzer can be useful not
> only for finding bugs, but also regarding micro-optimizations. :)
Actually ... that does interest me. That's a new use of static
analysis. I think that this would be generally well received in Tizen
too. If PVS Studio can do this too... it just bumped up in value for
me. I'd like to see what these reports are and what they point out etc.
> Best regards,
> Andrey Karpov, Microsoft MVP,
> Ph.D. in Mathematics, CTO
> "Program Verification Systems" Co Ltd.
> On 14.07.2017 4:35, Carsten Haitzler wrote:
> > On Thu, 13 Jul 2017 14:26:35 +0300
> > Andrey Karpov <karpov at viva64.com> wrote:
> > Could you:
> > 1. Include this as a bugs per 1k lines of code or similar metric?
> > Total bugs is not that useful without knowing total size of code
> > looked at. At least in the summary.
> > 2. Include metrics calculated similarly for other major projects
> > (Linux kernel, etc. etc.).
> > Why? The below is like saying "you're doing 120km/h!!!!!!" ... but
> > if it's on a freeway and the speed limit is 130km/h ... in context
> > it's very different. This here lacks context.
> > As I haven't used PVS studio before (it's on a list of things to try
> > out and see if it's good), but I do know Coverity's scan service
> > very well, I'll do some back of a napkin numbers:
> > 1. In my experience about ~10-15% of bugs are false positives etc.
> > with coverity.
> > 2. Coverity says Linux kernel gets 0.48 issues peer 1k lines of
> > code. applying the above false positive rate, let's call that 0.40.
> > Qt gets 0.72, so lets call that 0.61 adjusting for false positives.
> > Glib gets 0.45, so 0.38 accounting for false positives. So:
> > With your numbers, Tizen sees 900 issues in 2.4 million lines of
> > code. that comes out at 0.38.
> > Linux kernel = 0.40
> > Qt = 0.61
> > Glib = 0.38
> > Tizen = 0.38
> > Yes PVS studio is a different tool to coverity. I'm making an
> > assumption (much like you do too in many ways) that these two tools
> > are in the same ballpark and will report similar issues and
> > numbers, but may be disjoint sets. I'm going with this assumption
> > because you didn't provide other numbers to go by, and it'd be nice
> > to.
> > My conclusion is that Tizen code quality is pretty decent in the
> > scheme of things. It's bug rate is pretty low-ish.
> > Now on the other side, it';s always great to have tools point out
> > possible errors. Another tool is another weapon in a war chest to
> > improve code quality. That's a good thing. Bugs should be looked
> > into and addressed accordingly based on actual severity and
> > context. just blindly fixing issues will result in misallocation of
> > time and resources because it may be an issue in a debug tool that
> > is rarely used and only for gathering quick information by a
> > developer when something goes wrong... it may be a seriously
> > exploitable bug in code that is always able to be triggered
> > remotely. So context is important. Knowing issues are there and
> > what a tool thinks they are is a great speedup vs full code review.
> > PVS Studio is indeed such a tool. There are others too. We have
> > tools of our own we're using more and more.
> >> Hello All,
> >> This article will demonstrate that during the development of large
> >> projects static analysis is not just a useful, but a completely
> >> necessary part of the development process. This article is the
> >> first one in a series of posts, devoted to the ability to use
> >> PVS-Studio static analyzer to improve the quality and reliability
> >> of the Tizen operating system. For a start, I checked a small part
> >> of the code of the operating system (3.3%) and noted down about
> >> 900 warnings pointing to real errors. If we extrapolate the
> >> results, we will see that our team is able to detect and fix about
> >> 27000 errors in Tizen. Using the results of the conducted study, I
> >> made a presentation for the demonstration to the Samsung
> >> representatives with the offers about possible cooperation. The
> >> meeting was postponed, that is why I decided not to waste time and
> >> transform the material of the presentation to an article:
> >> https://www.viva64.com/en/b/0519/
> >> ----
> >> Best regards,
> >> Andrey Karpov, Microsoft MVP,
> >> Ph.D. in Mathematics, CTO
> >> "Program Verification Systems" Co Ltd.
> >> _______________________________________________
> >> Dev mailing list
> >> Dev at lists.tizen.org
> >> https://lists.tizen.org/listinfo/dev
More information about the Dev