At the moment I’m struggling with Microchip’s new “Harmony” framework for the PIC32. I don’t want to say bad things about it because (a) I haven’t used it enough to give a fair opinion and (b) I strongly suspect it’s a useful thing for some people, some of the time.

Harmony is extremely heavyweight. For example, the PDF documentation is 8769 pages long. That is not at all what I want – I want to work very close to the metal, and to personally control nearly every instruction executed on the thing, other than extremely basic things like <stdlib.h> and <math.h>.

Yet Microchip says they will be supporting only Harmony (and not their old “legacy” peripheral libraries) on their upcoming PIC32 parts with goodies like hardware floating point, which I’d like to use.

So I’m attempting to tease out the absolute minimum subset of Harmony needed to access register symbol names, etc., and do the rest myself.

My plan is to use Harmony to build an absolutely minimal configuration, then edit down the resulting source code to something manageable.

But I found that many of Microchip’s source files are more than 99% comments, making it essentially impossible to read the code and see what it actually does. Often there will be one or two lines of code here and there, separated by hundreds of lines of comments.

So I wrote the Python script below. Given a folder, it will walk through every file in it (and in its subfolders) and replace all the .c, .cpp, .h, and .hpp files with identical ones, but with all comments and blank lines removed.

I’ve only tested it on Windows, but I don’t see any reason why it shouldn’t work on Linux and Mac.

from __future__ import print_function
import sys, re, os

# for Python 2.7
# Use and modification permitted without limit; credit to NerdFever.com requested.

# thanks to zvoase at http://stackoverflow.com/questions/241327/python-snippet-to-remove-c-and-c-comments
# and Lawrence Johnston at http://stackoverflow.com/questions/1140958/whats-a-quick-one-liner-to-remove-empty-lines-from-a-python-string
def comment_remover(text):
    def replacer(match):
        s = match.group(0)
        if s.startswith('/'):
            return " " # note: a space and not an empty string
        else:
            return s
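    # Alternatives, in order: // line comment, /* block comment */,
    # character literal, string literal. Literals are matched (and kept)
    # so that comment markers inside them are left alone.
    pattern = re.compile(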
        r'//.*?$|/\*.*?\*/|\'(?:\\.|[^\\\'])*\'|"(?:\\.|[^\\"])*"',
        re.DOTALL | re.MULTILINE
    )
    
    r1 = re.sub(pattern, replacer, text)
    
    return os.linesep.join([s for s in r1.splitlines() if s.strip()])


def NoComment(infile, outfile):
        
    root, ext = os.path.splitext(infile)
    
    valid = [".c", ".cpp", ".h", ".hpp"]
    
    if ext.lower() in valid:
           
        # Read the whole file, strip the comments, then write the result
        with open(infile, "r") as inf:
            dirty = inf.read()

        clean = comment_remover(dirty)

        # 'b' avoids 0d 0d 0a line endings in Windows
        with open(outfile, "wb") as outf:
            outf.write(clean)
        
        print("Comments removed:", infile, ">>>", outfile)
        
    else:

        print("Did nothing:     ", infile)

if __name__ == "__main__":
    
    if len(sys.argv) < 2:
        print("")

        print("C/C++ comment stripper v1.00 (c) 2015 Nerdfever.com")
    
        print("Syntax: nocomments path")

        sys.exit()
        
    top = sys.argv[1]
    
    # Walk the whole tree; each file is processed in place
    for root, folders, fns in os.walk(top):

        for fn in fns:
    
            filePath = os.path.join(root, fn)
            NoComment(filePath, filePath)
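
A quick way to see what the regular expression does is to try it on a small sample. The sketch below reuses the same pattern in a standalone helper; the name strip_comments and the sample text are made up for illustration, and it joins lines with "\n" rather than os.linesep to keep the output easy to inspect.

```python
import re

# Same pattern as in the script: comments first, then char and string literals
PATTERN = re.compile(
    r'//.*?$|/\*.*?\*/|\'(?:\\.|[^\\\'])*\'|"(?:\\.|[^\\"])*"',
    re.DOTALL | re.MULTILINE
)

def strip_comments(text):
    """Hypothetical helper: remove C/C++ comments and blank lines from text."""
    def replacer(match):
        s = match.group(0)
        # Comments start with '/'; literals start with a quote and are kept
        return " " if s.startswith('/') else s
    stripped = PATTERN.sub(replacer, text)
    return "\n".join(line for line in stripped.splitlines() if line.strip())

sample = '''/* header
   comment */
int x = 1; // trailing
char *url = "http://example.com"; /* inline */
'''

result = strip_comments(sample)
print(result)
```

The two code lines survive (with some trailing spaces where comments used to be), and the "//" inside the string literal is not mistaken for a comment.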
    

To use it, save the script as "nocomments.py", then run:

python nocomments.py foldername

Of course, make a backup of the original folder first, since the script overwrites the files in place.
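
If you'd rather script the backup too, something like the following should do. It's only a sketch: the function name backup_folder and the ".bak" suffix are my own choices, not part of the script above, and it refuses to run if the backup folder already exists.

```python
import os
import shutil

def backup_folder(src, suffix=".bak"):
    """Copy the folder src to a sibling folder named src + suffix."""
    # abspath also normalizes the path, dropping any trailing separator
    src = os.path.abspath(src)
    dst = src + suffix
    if os.path.exists(dst):
        # Don't clobber an earlier backup
        raise OSError("backup already exists: " + dst)
    shutil.copytree(src, dst)
    return dst
```

Call backup_folder("foldername") once before running python nocomments.py foldername; the untouched originals will then be in "foldername.bak".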