John Tsiombikas nuclear@mutantstargoat.com
15 August 2008
I get really excited each time my favourite development environment (UNIX + vi) lets me perform a task of insurmountable tediousness in a split second. So much, in fact, that I can't keep myself from sharing it with others. And since updates on this blog are scarce anyway, I decided to write about it.
The tedious task in question was extracting a shitload of protocol replies (for instance error codes) from an RFC and writing the appropriate enumerations for them in a C program I'm writing. The stupid way of doing that, typing a hundred enumeration names and their values by hand, is boring, slow and error-prone. However, there is a better way of doing it.
The first step is to determine how to extract the error names and their numeric values from the RFC text. A quick examination of the relevant part of the RFC shows that each of them sits on a line of its own: some leading whitespace, followed by the numeric value, more whitespace, and finally the name of the error. For instance, here is one of them:
431 ERR_NONICKNAMEGIVEN
":No nickname given"
- Returned when a nickname parameter expected for a
command and isn't found.
So, it's dead easy to extract them using standard UNIX magic:
egrep '^.*[0-9]+.*ERR_' ~/docs/rfc/irc.txt
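To try the pattern without the RFC at hand, here is a minimal sketch that runs it over a small sample. The temporary file and the extract_errors helper are made up for illustration, the sample lines are the ones quoted in this post, and the amount of leading whitespace is my guess at the RFC layout. I use grep -E, which is the POSIX spelling of egrep.

```shell
# Hypothetical sample laid out like the RFC excerpt above:
# leading whitespace, number, more whitespace, error name.
sample=$(mktemp)
cat > "$sample" <<'EOF'
        431     ERR_NONICKNAMEGIVEN
                        ":No nickname given"

                - Returned when a nickname parameter expected for a
                  command and isn't found.

        444     ERR_NOLOGIN
EOF

# Same pattern as the egrep command above.
extract_errors() {
    grep -E '^.*[0-9]+.*ERR_' "$1"
}

# Prints only the two lines that carry a number and an ERR_ name.
extract_errors "$sample"
```

The description lines drop out because they contain neither digits nor ERR_.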
Now, in order to insert the output of that grep invocation into our code, the '!' vi command comes in handy. It filters part of the buffer through an external program, replacing those lines with the program's output. So, I proceeded to open an enum { ... }; block, moved the cursor to an empty line between the opening and the closing brace, and typed:
!!egrep '^.*[0-9]+.*ERR_' ~/docs/rfc/irc.txt
The result is something like the following:
enum {
444 ERR_NOLOGIN
445 ERR_SUMMONDISABLED
446 ERR_USERSDISABLED
451 ERR_NOTREGISTERED
.... many of those ...
};
Which of course is not exactly what we need for a C enumeration. We need to turn the above into:
enum {
ERR_NOLOGIN = 444,
ERR_SUMMONDISABLED = 445,
ERR_USERSDISABLED = 446,
ERR_NOTREGISTERED = 451,
.... more ...
};
This turns out to be a snap with the vi regular expression search and replace facility.
:%s/[ ]\+\([0-9]\+\).*\(ERR_.*\)/\t\2 = \1,/
The regular expression "[ ]\+\([0-9]\+\).*\(ERR_.*\)" matches anything that starts with one or more spaces, followed by one or more digits (we place the digits in a group by parenthesizing this part), then any number of characters followed by a string starting with ERR_ (placed in a second group). The whole thing is then replaced by a tab, followed by the second group, an equals sign, the first group, and finally a comma ("\t\2 = \1,").
And presto, we've got our C enumeration, with minimal effort.
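The same substitution can also be done outside the editor. What follows is a rough sed sketch, not what I actually ran: the to_enum_lines name is made up, and the pattern is a bit stricter than the vi one (anchored at the start of the line, and the name limited to capitals, digits and underscores). The tab is produced with printf because \t in a sed replacement is not portable.

```shell
# Turn "  <number>  ERR_<NAME>" lines on stdin into
# "<tab>ERR_<NAME> = <number>," lines on stdout.
to_enum_lines() {
    tab=$(printf '\t')   # literal tab character for the replacement
    sed -E "s/^[[:space:]]+([0-9]+)[[:space:]]+(ERR_[A-Z0-9_]+).*/${tab}\2 = \1,/"
}

# Two of the lines shown earlier, fed in by hand.
printf '    444     ERR_NOLOGIN\n    445     ERR_SUMMONDISABLED\n' | to_enum_lines
```

Group 1 is the number and group 2 is the name, exactly as in the vi version; only the group syntax differs, since sed -E uses bare parentheses instead of \( and \).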
Then, by following the same procedure for protocol commands and replies (other than errors), I was able to create a 200-line header file in one or two minutes.
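For the record, the whole procedure can be sketched as one small shell function that takes the name prefix as a parameter. This is an illustration rather than the commands I used: gen_enum is a made-up name, the RPL_ prefix for non-error replies and the sample lines are assumptions, and the line layout is assumed to match the error lines above. Note that a value written with leading zeros would be read as octal by the C compiler; the sketch does not deal with that.

```shell
# gen_enum PREFIX FILE
# Print a C enum with one entry per "<number>  PREFIX_<NAME>" line in FILE.
gen_enum() {
    prefix=$1
    file=$2
    tab=$(printf '\t')   # literal tab for indentation
    echo 'enum {'
    grep -E "^[[:space:]]+[0-9]+[[:space:]]+${prefix}_" "$file" |
        sed -E "s/^[[:space:]]+([0-9]+)[[:space:]]+(${prefix}_[A-Z0-9_]+).*/${tab}\2 = \1,/"
    echo '};'
}

# Hypothetical input mixing one reply line and one error line.
sample=$(mktemp)
printf '        301     RPL_AWAY\n        444     ERR_NOLOGIN\n' > "$sample"

gen_enum RPL "$sample"   # enum containing only the RPL_ entry
gen_enum ERR "$sample"   # enum containing only the ERR_ entry
```

Redirecting the output of a few such calls into a file gives the skeleton of the header.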
This was initially posted on my old WordPress blog. Visit the original version to see any comments.