Source Code Analysis Gem


SAN (Source ANalysis) is a Ruby gem for analyzing the contents of source code, including comment-to-script ratios, todo items, declared functions and classes, and more.

Supported Languages

  • Ruby
  • PHP
  • JavaScript
  • CSS

Why?

This is a practical way to evaluate a new framework you may be considering, whether you want to gauge how much documentation is available or are simply curious about what its code-base is made up of. The tool is also useful for analyzing your own projects, either out of personal interest or to aid in the quoting process.

Installation

First, open a terminal window and add GitHub to the gem sources, allowing GitHub to serve gems.


$ gem sources -a http://gems.github.com

Next, install the gem itself with the command below.


$ sudo gem install visionmedia-san

The source code analysis gem should now be installed, and you are on your way!

Usage

Scan the current directory (non-recursive).

$ san .

declared functions 43
files 5
files php 5
lines 1899
lines blank 218
lines comments 436
comment ratio 0.23

Recursively analyze the current directory.

$ san -r .

declared classes 207
declared functions 4990
files 585
files css 90
files inc 196
files install 49
files js 52
files module 84
files php 114
lines 168395
lines blank 16154
lines comments 45953
lines todo 208
comment ratio 0.27

Single file analysis.

$ san index.php

files 1
files php 1
lines 39
lines blank 5
lines comments 13
comment ratio 0.33
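
The comment ratio in the three sample outputs above is consistent with 'lines comments' divided by 'lines', rounded to two decimals. The sketch below only checks that arithmetic against the sample numbers. The helper name comment_ratio is hypothetical and is not part of san, and the formula is inferred from the output rather than taken from the gem's source.

```ruby
# Hypothetical helper, not part of san: derive the ratio from two counts.
# Assumes ratio = comment lines / total lines, as the sample outputs suggest.
def comment_ratio(comment_lines, total_lines)
  return 0.0 if total_lines.zero?
  (comment_lines.to_f / total_lines).round(2)
end

puts comment_ratio(436, 1899)     # non-recursive example, prints 0.23
puts comment_ratio(45953, 168395) # recursive example, prints 0.27
puts comment_ratio(13, 39)        # single file example, prints 0.33
```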

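Retrieve a single metric by filtering the output, where 'lines comments' may be any label printed by the source code analysis tool. In the output shown above the value is always the last field on the line, so awk prints $NF, which also works for one-word labels such as 'files'.

$ san index.php | grep 'lines comments' | awk '{print $NF}'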
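
If you want several metrics at once, the report can also be read into a Hash from Ruby. This is a minimal sketch that assumes every report line has the "label value" shape shown in the samples above. The parse_report name is hypothetical, and the report text here is the single file example copied in rather than live output.

```ruby
# Parse "label value" report lines into a Hash keyed by label.
# Assumes each non-blank line ends in a numeric value, as in the samples.
def parse_report(text)
  text.each_line.with_object({}) do |line, metrics|
    # Split on the last space so multi-word labels stay intact.
    label, _, value = line.strip.rpartition(' ')
    next if label.empty? # skip blank lines and lines without a value
    metrics[label] = value.include?('.') ? value.to_f : value.to_i
  end
end

# Sample report copied from the single file example above.
report = <<~REPORT
  files 1
  files php 1
  lines 39
  lines blank 5
  lines comments 13
  comment ratio 0.33
REPORT

metrics = parse_report(report)
puts metrics['lines comments'] # prints 13
puts metrics['comment ratio']  # prints 0.33
```

In real use the report string could come from running the command in backticks, for example `san index.php`, assuming the san executable is on your PATH.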

Comments

Nice! The island is great.
To be honest, I am not sure what I would name that method to disambiguate it. I don't like long names for methods that would potentially be used a lot, but maybe 'humanize_bytes' or 'to_readable_filesize' or 'bytes_to_human'. I am not sure haha, I would have to see how it reads once written out.

Oh, and I would not call that method "to_filesize", with comments or without. The problem is that it won't accurately express what it does where it's called.

Those examples would be provided by unit tests.

One problem with comments is that there's no guarantee they match the actual code. Tests always match the actual code.

(I can't forgo mentioning that I was born in Nanaimo. :)

I would honestly still disagree though, take the following example:

def to_filesize(bytes)
  # Code to convert to human readable filesize here..
end

or

def to_human_readable_filesize(bytes)
  # Code to convert to human readable filesize here..
end

or

# Convert filesize in bytes to a human readable representation.
#
# Examples:
#   to_filesize(1024) => '1 KiB'
#   1024.to_filesize  => '1 KiB'
#
# See Integer.to_filesize
#
def to_filesize(bytes)
  # Code to convert to human readable filesize here..
end

I prefer the last method, as you can reference examples, and other methods that may be of interest.

Sure, a much larger scope for sure. I cannot imagine a code-base such as AS3 EVER leaving out documentation, but it's an interesting concept, and I will take a look at that article.

If the Drupal code needs comments in order to be understood, then it needs comments. Of course, library code has a much higher threshold for "being understood" than application code.

The last few places I've worked had a "no comments unless necessary" policy. As a result, we had to learn to write readable code. I think there were fewer bugs as a result.

See http://c2.com/cgi/wiki?ToNeedComments . There's plenty of disagreement on this issue, but it's worth examining.

Not true.. I would challenge you to look at a large code-base such as Drupal, strip out all comments, and understand how to use every routine. Many routines are polymorphic, something you would never know without documentation, so you may not even be utilizing their power. I have nothing against your statement, you're of course free to believe that, I just strongly disagree.

I wasn't talking about file size - I was talking about writing code that's easy to understand. If the code is easy to understand, it doesn't need commenting. If it's not easy to understand, it does. We should always strive to write code that's easy to understand, hence we should strive for as few comments as possible.

Great! Any links back would be awesome. It's still a tiny project, but I hope to support more languages and features soon.

It is not always true that a small comment ratio is better; comments are parsed out anyway, so they are no burden to have. Clarity is much more important than one-liners and slim filesize. Personally I would consider a high ratio awesome. I would agree that in most cases the terminology used should make things obvious, yet comments are inescapable at times.

I just finished writing a post today reviewing 10 Ruby tools. I included a link to your tool at the bottom, but haven't had a chance to check it out yet. Always nice to see some new tools for the Ruby community.

Of course, the comment ratio should be as low as possible, right? Only unobvious code should be commented and unobvious code should be minimized.