Wholly Unbalanced Parentheses: 2012

Thursday, December 20, 2012

Learn Vimscript the Hard Way, Commentary II

As I wrote in my first commentary on Steve Losh’s Learn Vimscript the Hard Way, I think it is a worthwhile book for people to read to get more familiar with the mechanics of customising their Vim environment.

What about those who want to use it to actually learn Vimscript?

It does an acceptable job of that too.

The chapters on Vimscript itself (19-27 and 35-40) cover the syntax and semantics of the language, with examples and exercises spread throughout to give the learner the necessary hands-on experience.

I want to stress here that I feel Steve didn’t intend the what of those examples to be used literally in anyone’s vimrc files or personal plugins (in fact, I believe that to be true of the whole book at large) but rather the how of techniques shown. Don’t create your own little maps in your ~/.vimrc file for commenting lines in various filetypes and don’t write your own toy snippets system — very good plugins exist for these purposes already. Do learn that you can do these sorts of things so that when the time comes for you to really write something new, you will know how to.

Steve also provides two larger exercises, starting at chapters 32 and 41 respectively. The first is a new operator to grep for the motioned text, and the second is a full-blown plugin for a new programming language. Both serve as good models for the sort of larger works a practising VimLer produces.

I do recommend this book, and it’s freely available to read online. Another resource I would recommend for learning Vimscript is Damian Conway’s five-part developerWorks article series, Scripting the Vim Editor; that’s how I first got into VimL (VimL is short for Vim Scripting Language and is another name for Vimscript). Vim’s built-in :help usr_41 is the user guide to writing Vim scripts and :help eval.txt is the reference manual on VimL’s expression evaluation.

Do you have a favourite resource for learning VimL?

[update]
Oops... I forgot to add my remarks on some of the technical aspects of Steve's work:

As Steve says, always use :help nore maps until you know you need otherwise.
In the same vein, always use :normal! (instead of the oft shown :normal) to avoid user-defined keymaps on the right hand side.
The :echom command (and friends) expects the evaluations of its expressions to be of type string. Use :help string( to coerce lists and dictionaries to strings for use in these commands. E.g. :echom string(getline(1, '$'))
Vim's help system is context aware based on the format of the help tag. See :help help-context for the list of formats.
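As a small sketch of the :echom remark above (the variable names and sample data are made up purely for illustration), string() can be applied to both lists and dictionaries before handing them to :echom:

```vim
" Hypothetical sample data, only for illustration.
let lines = ['alpha', 'beta']
let info = {'count': len(lines)}

" string() turns the List and the Dictionary into Strings for :echom.
echom string(lines)
echom string(info)
```

The two messages should then show up in :messages as ['alpha', 'beta'] and {'count': 2}.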

Learn Vimscript the Hard Way, Commentary I

Steve Losh has written a book called Learn Vimscript the Hard Way.

It’s badly titled, imho. His definition of the book explains why I say that: a book for users of the Vim editor who want to learn how to customize Vim. With that description in mind, I think the book achieves its goal — a goal that all vimmers would aspire to master. However with the title of the book, I fear even many proficient vimmers would assume that the material is out of their reach or too dense to absorb right now with their busy schedules, relegating it to a later reading pile, at best.

For all you up and coming Vimmers looking to read something to take you beyond all of the beginner tutorials out there, read chapters: 0-18, 28-32, 43-48, 50 & 56.

For those of you who picked the book up specifically because of its title, that review is coming soon. :-)

Sunday, December 16, 2012

call() for a Good Time

Simple functions in Vim are declared like this:

function! A(a, b, c)
  echo a:a a:b a:c
endfunction

call A(1, 2, 3)

There’s probably nothing surprising there except for the a:a syntax, which is how Vim insists on accessing the function’s arguments (mnemonic: a: for argument).
Just as simple is calling function A() from another function, B(), passing its arguments directly along to A():

function! B(a, b, c)
  return A(a:a, a:b, a:c)
endfunction

call B(1, 2, 3)

Nothing surprising there at all. But we’ve just laid the groundwork for the main attraction tonight. In VimL, you can call a function using the library function call(func, arglist) where arglist is a list. If you’re calling a function that takes multiple arguments, collect them in an actual list like this:

function! C(a, b, c)
  return call("A", [a:a, a:b, a:c])
endfunction

call C(1, 2, 3)

If you already have the elements in a list, no need to wrap it in an explicit list:

function! D(a)
  return call("A", a:a)
endfunction

call D([1, 2, 3])

Let’s step it up a notch. What if you want to be able to accept the args as either separate arguments or as a list? Vim has your back with variadic functions cloaked in a syntax similar to C’s: ...

Variadics in the key of V:

a:0 is a count of the variadic arguments
a:000 is all of the variadic arguments in a single list
a:1 to a:20 are positional accessors to the variadic arguments
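Here is a tiny sketch of those three accessors at work. ShowArgs() is a hypothetical helper invented for this illustration; it just hands back what each accessor holds:

```vim
" Hypothetical helper that reports what the variadic accessors hold.
function! ShowArgs(...)
  " a:1 only exists when at least one variadic argument was passed.
  let first = a:0 > 0 ? a:1 : 'none'
  return [a:0, copy(a:000), first]
endfunction

echo ShowArgs('x', 'y')
echo ShowArgs()
```

The first call should echo [2, ['x', 'y'], 'x'] and the second [0, [], 'none'].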

So now it doesn’t matter how we receive the arguments — standalone or in a list — we can keep Vim happy and call A() appropriately.

function! E(...)
  if a:0 == 1
    return call("A", a:1)
  else
    return call("A", a:000)
  endif
endfunction

call E(1, 2, 3)
call E([1, 2, 3])

Ok. That’s not too bad; it’s perhaps a little awkward. We’re calling A() directly here, but it shouldn’t be a surprise to see that we can call C() in the same way too:

function! F(...)
  if a:0 == 1
    return call("C", a:1)
  else
    return call("C", a:000)
  endif
endfunction

call F(1, 2, 3)
call F([1, 2, 3])

Pretty straightforward. What about calling D() instead which expects a single list argument? Hmm… if Vim wants a list, give him a list:

function! G(...)
  if a:0 == 1
    return call("D", [a:1])
  else
    return call("D", [a:000])
  endif
endfunction

call G(1, 2, 3)
call G([1, 2, 3])

It’s worth stopping briefly here to consider what call() is doing to that arglist: It’s splatting it (extracting the arguments and passing them as separate members to the called function). Nice. Wouldn’t it be nice if we could splat lists ourselves? Well, be envious of Ruby coders no more because we can splat lists in VimL!

Splat!

To splat a list into separate variables (a, b and c here):

let [a, b, c] = somelist

Read :help :let-unpack for the juicy extras.
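One thing worth knowing before leaning on this: the number of targets has to match the number of list items (unless you use the ;rest form). A minimal sketch, with made-up variable names, of what happens when they don't match:

```vim
" Three targets, three items: works.
let [first, second, third] = [1, 2, 3]
echo first second third

" Two targets, three items: Vim raises E687 (fewer targets than items).
let caught = ''
try
  let [x, y] = [1, 2, 3]
catch /E687:/
  let caught = 'E687'
endtry
echo caught
```

So a function that splats a:000 into exactly three names will throw if it is called with two or four arguments, which may or may not be the argument checking you want.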

I like the splatting approach because it gives us variable names to play with inside our function:

function! H(...)
  if a:0 == 1
    let [a, b, c] = a:1
  else
    let [a, b, c] = a:000
  endif
  return D([a, b, c])
endfunction

call H(1, 2, 3)
call H([1, 2, 3])

Of course, it works just as well for calling functions with explicit multiple arguments, like C():

function! I(...)
  if a:0 == 1
    let [a, b, c] = a:1
  else
    let [a, b, c] = a:000
  endif
  return C(a, b, c)
endfunction

call I(1, 2, 3)
call I([1, 2, 3])

You’ll notice that the splat semantics are identical between H() and I() and only the call of D() and C() change, respectively. This is very neat, I think.
So far we’ve been calling through to functions that call A() directly. Happily, we can call through to one of these dynamic functions (like E(), but any would work as well) and have it Just Work too:

function! J(...)
  if a:0 == 1
    let [a, b, c] = a:1
  else
    let [a, b, c] = a:000
  endif
  return E(a, b, c)
endfunction

call J(1, 2, 3)
call J([1, 2, 3])

So, that’s it. Vim has variadic functions and splats. And splats are my recommended pattern for handling deep call chains between variadic functions.
There’s one last, cute, little thing about splats: you can collect a certain number of explicit arguments as you require, and then have any remaining arguments dumped into a list for you. The rest variable here will be a list containing [4, 5, 6] from the subsequent calls:

function! K(...)
  if a:0 == 1
    let [a, b, c; rest] = a:1
  else
    let [a, b, c; rest] = a:000
  endif
  echo "rest: " . string(rest)
  return E(a, b, c)
endfunction

call K([1, 2, 3, 4, 5, 6])
call K(1, 2, 3, 4, 5, 6)

And I thought this was going to be a short post when I started. I almost didn’t bother posting it because of that reason.

Saturday, December 1, 2012

Rapid Programming Language Prototypes with Ruby & Racc, Commentary

I just watched a tolerable ruby conference video by Tom Lee on Rapid Programming Language Prototypes with Ruby & Racc.

What he showed he showed fairly well. His decision to "introduce compiler theory" was, he admitted, last-minute and the hesitation in its delivery bore testimony to that. The demonstration of the compiler pipeline using his intended tools (ruby and racc) was done quite well with a natural progression through the dependent concepts along the way. By the end of the talk he has a functional compiler construction tool chain going from EBNF-ish grammar through to generated (and using gcc, compiled) C code.

I was surprised that nobody in the audience asked the question I was burning to ask from half way through the live-coding session: Why not use Treetop? (or the more generic: why not use a peg parser generator or a parser generator that does more of the heavy lifting for you?)

The whole point of Tom's presentation is: use ruby+racc because it saves you from all the headaches of setting up the equivalent tool chain in C/C++. And it does, he's right. But it feels to me that Treetop does even more of that hard work for you, allowing you to more quickly get to the fun part of actually building your new language. I'm angling for simplicity here.

I could be wrong, though, so let me ask it here (as Confreaks seems to not allow comments): Why not treetop (or an equally 'simple' parser generator) for something like this? (and answers along the lines of EBNF > PEG are not really what I'm after, but if you have a concrete example of that I'd like to hear it too.)

On a completely separate note: Tom, you need to add some flying love to your Vim habits. :-)

Thursday, November 8, 2012

Vim Motions

One of the more frequent admonishments delivered on #vim to the whining novice or the curious journeyman is to master the many motions within the editor. Previously, a bewildering list of punctuation and jumbled letters was unceremoniously dumped on the complainant with the misguided expectation that they'd then take themselves off and get right to the task of memorising the eighty odd glyphs. We mistook their silence for compliance but I rather suspect it was more bewilderment or repulsion or sheer paralysis. In an attempt to friendly that mess up, I have started an infographic series intended to cover the twelve major categories, probably spread over nine separate infographics.

The Vim Motions Infographic Series (in 9 parts):

1. Line & Buffer
2. Column
3. Word
4. Find
5. Search
6. Large Objects
7. Marks, Matches & Folds
8. Text Objects (not motions, but mesh nicely at this point)
9. Creating your own Text Objects

I plan to have a different expression on the chibi's face in each of the pages. I'll move the crying one from the Large Object page (as shown below) to page 1 and then progressively improve her mood through the remaining pages: something like -- crying, disappointment, resignation, hope, amazement, happiness, confidence, smugness and something devilish. As an update on that, I have inked five of the chibis now. I look forward to having them all up in their own infographics.

I decided to have the background colour change to suit the mood of the chibi, starting from black in image number one to represent depression and despair. I will roughly follow the same colour spread I used on the How Do I Feel graphic.

I have no experience in putting together a multi-page piece like this. Feedback certainly welcome. I was vaguely thinking of having it a bit like a magazine or comic book spread, but I don't know how to do that or whether it's the right or even a good approach.

Legend:
Green indicates cursor origin before issuing the motion.
Red indicates cursor destination at the end of the motion.
Orange shows the area covered by the motion. This would be the same area highlighted in Vim if these motions were used in Visual mode.

1. Line & Buffer Motions

2. Column Motions

6. Large Object Motions

The Many Faces of % in Vim

Pity the poor Vimmer for he has so many a face to put to percent:

Help Topic	Description
`N%`	go to {count} percentage in the file
`%`	match corresponding [({})] (enhanced with matchit.vim plugin)
`g%`	enhanced match with matchit.vim plugin — cycle backwards through matches
`:%`	as a range, equal to :1,$ (whole file)
`:_%`	used as an argument to an :ex command as the name of the current file
`"%`	as a register, the name of the current file
`expr-%`	in VimL as modulo operator
`expand()`, `printf()` and `bufname()`	in VimL, expand() and bufname() accept % as the current file or buffer name; printf() uses % in its format specifiers
`'grepformat'`, `'errorformat'`, `'shellredir'`, `'printheader'` and `'statusline'`	various options use % as a printf-like format specifier
Regular Expression Atoms:
Match locations:
`\%#`	cursor position
`\%'`	position of a mark
`\%l`	specific line
`\%c`	specific column
`\%v`	specific virtual column

`\%(`	non-capturing group (creates no back-reference)
`\%[`	sequence of optionally matched atoms
Numeric character specifier in matches:
`\%d`	decimal
`\%o`	octal
`\%x`	hex (2 digits)
`\%u`	hex (4 digits)
`\%U`	hex (8 digits)
Absolute file or string boundaries:
`\%^`	start of file (or start of string)
`\%$`	end of file (or end of string)

`\%V`	match inside visual area
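
A few of these faces can be tried straight from the command line or a scratch script. This is only a sampler, not a complete tour; the values used are arbitrary:

```vim
" expr-%: modulo operator, gives 1
echo 7 % 3

" printf(): % introduces a format specifier, %% is a literal percent sign
echo printf('%d%%', 50)

" expand(): % stands for the current file name (:t keeps only the tail)
echo expand('%:t')

" \%x and \%d: character given by hex or decimal code (both match 'A')
echo 'A' =~# '^\%x41$'
echo 'A' =~# '^\%d65$'

" \%^ and \%$: start and end of the string being matched
echo 'abc' =~# '\%^abc\%$'
```

The modulo line should echo 1, the printf() line 50%, and the three regex lines 1 each; the expand() line depends on whatever file you happen to be editing.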

Wednesday, November 7, 2012

Vim's Pipes

Thursday, October 11, 2012

A Little Drop of Prudence

I like my Vim with a little drop of prudence

I frequently create temporary macros or maps for ad hoc edits when I find myself having to do the same job more than a few times over. This is a good thing to spend some time reflecting on in your editing. If you’re in the heat of the moment and don’t want to break concentration or waste time on R&D right now, make a note to come back when you have time to look at your current editing inefficiency. I recommend setting up a practise file (something I mention in learnvim) which you can quickly jump to using a global bookmark.

Setting up a Practise File

:e ~/vim-practise.txt
mP

This sets a global mark that can be jumped to from anywhere within vim using the normal mode ' command. (:help 'A)

Jumping to your Practise File

'P

So, there you are editing away on another dreary Tuesday and in a moment of lucidity you realise you’ve just mashed the same key pattern a dozen times over — you’ve just discovered an inefficiency! Awesome.

Quick check: “Do I have time to investigate and optimise this now?”

No: :-( Sucks to be you. Quick! To the Practise File! Make a quick note about this so that you can come back to it on your morning tea break.

Yes: :-) You soldier! Yank a sample snippet of the problem at hand and then… Quick! To the Practise File! Paste in your snippet and start experimenting with ways to optimise the necessary changes. Is there a map or macro you can make to wrap these steps up into a fast and simple solution? Take your macro/map back to the real work you were doing before this R&D diversion and finish the rest of those lame edits with the genuine vim you should be applying to life.

“But couldn’t I have just experimented in my original file?”

Sure… but then you lose the problem. You’re left with just a finished solution. A useless page of sterile, problemless text. That might please your boss and clients, but it’s just no good for your continued development as a vimmer. You’ll grow more by squirreling away interesting little nasties like this that you find in the wild so that you can revisit them during quieter moments as part of your Deliberate Practise regimen.

Sunday, September 9, 2012

Genetic Algorithms in VimL (Part I)

Burak Kanber is into machine learning. I was entertained by his Hello World genetic algorithm example and, in alignment with his implementation language agnosticism, I thought I'd write a version in VimL:

This depends on vim-rng for the random number stuff.

let s:Chromosome = {}

function! s:Chromosome.New(...)
  let chromosome = copy(self)
  let chromosome.code = ''
  let chromosome.cost = 9999
  let chromosome.pivot = 0

  if a:0
    let chromosome.code = a:1
    let chromosome.pivot = (strchars(chromosome.code) / 2) - 1
  endif

  return chromosome
endfunction

function! s:Chromosome.random(length)
  let self.code = RandomString(a:length)
  let self.pivot = (a:length / 2) - 1
  return self
endfunction

function! s:Chromosome.mutate(chance)
  if (RandomNumber(100) / 100.0) < a:chance
    let index = RandomNumber(1, strchars(self.code)) - 1
    let upOrDown = RandomNumber(100) <= 50 ? -1 : 1
    let exploded = split(self.code, '\zs')
    let change = nr2char(char2nr(exploded[index]) + upOrDown)
    if index == 0
      let self.code = change . join(exploded[index+1:], '')
    else
      let self.code = join(exploded[0:index-1], '') . change . join(exploded[index+1:], '')
    endif
  endif
  return self
endfunction

function! s:Chromosome.mate(chromosome)
  let child1 = strpart(self.code, 0, self.pivot) . strpart(a:chromosome.code, self.pivot)
  let child2 = strpart(a:chromosome.code, 0, self.pivot) . strpart(self.code, self.pivot)
  return [s:Chromosome.New(child1), s:Chromosome.New(child2)]
endfunction

function! s:Chromosome.calcCost(compareTo)
  let total = 0
  let i = 0
  while i < strchars(self.code)
    let diff = char2nr(self.code[i]) - char2nr(a:compareTo[i])
    let total += diff * diff
    let i += 1
  endwhile
  let self.cost = total
  return self
endfunction

function! s:Chromosome.to_s()
  return self.code . ' (' . string(self.cost) . ')'
endfunction


let s:Population = {}

function! s:Population.New(goal, size)
  let population = copy(self)
  let population.members = []
  let population.goal = a:goal
  let population.generationNumber = 0
  let population.solved = 0

  let size = a:size
  let length = strchars(population.goal)
  while size > 0
    let chromosome = s:Chromosome.New()
    call chromosome.random(length)
    call add(population.members, chromosome)
    let size -= 1
  endwhile

  return population
endfunction

function! s:Population.display()
  % delete
  call setline(1, "Generation: " . self.generationNumber)
  call setline(2, map(copy(self.members), 'v:val.to_s()'))
  redraw
  return self
endfunction

function! s:Population.costly(a, b)
  return float2nr(a:a.cost - a:b.cost)
endfunction

function! s:Population.sort()
  call sort(self.members, self.costly, self)
endfunction

function! s:Population.generation()
  call map(self.members, 'v:val.calcCost(self.goal)')

  call self.sort()
  call self.display()

  let children = self.members[0].mate(self.members[1])
  let self.members = extend(self.members[0:-3], children)

  let i = 0
  while i < len(self.members)
    call self.members[i].mutate(0.5)
    call self.members[i].calcCost(self.goal)
    if self.members[i].code ==# self.goal
      call self.sort()
      call self.display()
      let self.solved = 1
      break
    endif
    let i += 1
  endwhile

  let self.generationNumber += 1

  return self
endfunction

enew
let population = s:Population.New('Hello, world!', 20)
while population.solved != 1
  call population.generation()
endwhile

To see it run save it in a file and type :so % from within vim.
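
If you want to poke at the fitness measure on its own, here is a standalone sketch of the same sum-of-squared-differences idea used in calcCost() above. StringCost() is a hypothetical name of mine, it needs no vim-rng, and it assumes two ASCII strings of equal length:

```vim
" Hypothetical standalone version of the cost measure (not part of the
" script above). Assumes ASCII strings of equal length.
function! StringCost(candidate, goal)
  let total = 0
  for i in range(strlen(a:goal))
    let diff = char2nr(a:candidate[i]) - char2nr(a:goal[i])
    let total += diff * diff
  endfor
  return total
endfunction

echo StringCost('Hello', 'Hello')
echo StringCost('Hellp', 'Hello')
echo StringCost('Gdllo', 'Hello')
```

These should echo 0, 1 and 2: a perfect match costs nothing, and every character that is one code point off adds 1 to the cost.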

Why, dear bairui, you ask? Well, at this stage... I don't know. It just looked like fun. However, a couple of wild thoughts occurred to me: finding the ideal (good enough; as in 'correct' enough) combination of various vim options to achieve a desired look and behaviour. Take for example the various C indenting styles - what mad combination of &cinoptions, &cinkeys and &cinwords would you need to achieve Frankenstein's Indentation Style? What about getting &formatlistpat right for your preferred markup style? Sure, these might be totally hair-brained ideas -- but they might give you an idea for something less hairy and actually useful. Either way, I plan to keep playing with Burak's tutorial as he progresses through it. Thanks, Burak! :-)

Wednesday, August 15, 2012

A PEG Parser Generator for Vim

Barry Arthur
v1.2, August 15, 2012
v1.1, October 10, 2011

What is VimPEG?

VimPEG is a Parser Generator which uses the newer Parsing Expression Grammar formalism to specify parse rules.

Why VimPEG?

Vim is a powerful editor. It has lots of features baked right in to make it an editor most awesome. It has a deliciously potent regular expression engine, jaw-dropping text-object manipulations, and fabulous scriptability -- just to name a few of its aces.

One thing our little Vim still lacks, though, is an actual parser. Regular expressions will only get you so far when you're trying to analyse and understand complex chunks of text. If your text is inherently infinite or recursive, then regular expressions become at best cumbersome, and at worst, useless.

So, Vim needs a parser. I've needed one myself several times when wanting to build a new plugin:

Awesome! This idea will so rock! Now all I need to do is parse <SomeLanguage> and I'll be able to... awww... :-(

I've seen people ask on #vim: How can I <DoSomethingThatNeedsAParser>? And invariably the answer is: You can't. Not easily, anyway. Vimscript is a capable enough language to write your own parser in, but a little alien to most to do so.

You could also use one of the many language bindings that Vim comes bundled with these days to use a parser library in your favourite scripting language. The problem being that your code will only then run on a similarly compiled Vim (not everyone enables these extra language bindings) and with your parser library dependencies.

Beyond those two options, the world of parsing in Vim is quite scant. There exist a small handful of purpose-built recursive descent parsers that target a specific task (like parsing json), but for the general case -- a parser-generator -- you're out of luck. Until now. VimPEG aims to solve this problem.

VimPEG aims to be a 100% VimL solution to your parsing needs.

What would I use VimPEG for?

You've come to that paralysing sinkhole in your Vimming when you've said to yourself, "Damn... I wish Vim had a parser."
You've asked for something on #vim and the reply is "you can't do that because Vim doesn't have a parser."
You're up to your neck in recklessly recursive regexes.

Some ideas:

An expression calculator (the beginnings of which we explore here.)
Expanding tokens in typed text (think: snippets, abbrevs, maps.)
Semantic analysis of code -- for refactoring, reindenting (but sadly not syntax highlighting yet.)
C Code bifurcation based on #define values -- want to see what the code would look like with #define DEBUG disabled?
Coffeescript for Vim -- sugar-coating some of the uglies in VimL -- this example will be presented in a subsequent VimPEG article.

In fact, most of these ideas have been explored in part inside the examples/ directory of the VimPEG plugin.

For the purposes of introducing VimPEG and parsing in general (if you're new to it), let's consider a fairly easy example of reading and understanding (perhaps calculating) a sum series of integers. They look like this:

1 + 2 + 12 + 34

NOTE: Vim can already do this for you, so writing a parser for it here is purely pedagogical -- it's a simple enough example without being utterly devoid of educational value. I hope.

The list can be any (reasonable) length, from a single integer upwards. So, this is a valid input to our parser:

123

As are all of the following:

1 + 2
3 + 4 + 5
123 + 456 + 789

Stop. Right now. And think: How would you parse such an arbitrarily long series of integers separated by + operators? What tool would you reach for? What if you had to do it in Vim? And :echo eval('1 + 2 + 3') is cheating. :-p

We'll continue to use this example throughout this article and eventually show you how VimPEG solves this little parsing requirement.

But first, let's make sure we're all on the same page about the question: What is parsing?

Parsing

Feel free to skip to the next section if you're comfortable with the following concepts:

parsing
parser generators
(E)BNF and PEGs

Let's begin by defining some terms:

What is 'Parsing'?

Parsing is making sense of something. When we want a computer to understand something we've written down for it to do, it needs to 'parse' that writing. Without going into too much detail yet, let's consider a sentence uttered at one time or another by your parental unit: "Take the rubbish out!". When you (eventually -- after you unplug your iPod, put down your PS3 controller, pocket your smart-phone and wipe the disdain off your face) parse this sentence, your brain goes through two processes:

firstly, syntax recognition:

it scans the words to make sure they're legitimate:

they're in a language you know
they're all valid words, and
they're all in the right order

and secondly, semantic analysis:

it filters out the 'meaning' and presents that to a higher actor for further deliberation

In this case, the parser would extract the verb phrase 'take out' and the noun 'rubbish'. Your higher self (sarcasm aside) knows where this magic 'out' place is. We'll come back to these two processes ('syntax recognition' and 'semantic analysis') later.

In the case of our sum series of integers, syntax recognition would involve collecting the sequence of digits that comprise an integer, skipping unnecessary whitespace and expecting either an end of input or a + character and another integer and... so on. If the input contained an alphabetic character it would fail in this phase -- alphabetic characters are just not expected in the input. If the lexical recogniser found two integers separated by whitespace or two + characters in a row... it would not fail in this phase -- these are all valid tokens in 'this' lexical recogniser.

I am describing the more general process of lexical recognition and it being a separate stage to semantic analysis which is typical of a lot of parsers. PEG parsers, however, do not have separate phases as described here -- they are quite strict about not only what shape the next token must have, but also its purpose in this place (context) of the input. Having two consecutive integers or two consecutive + characters will upset a PEG parser expecting a sum series of integers -- it's just that it gets upset all in its single parse phase.

The semantic analysis phase is all about doing something "meaningful" with the collected integers. Maybe we should sum them? Maybe we just want to pass back a nested list structure representing the parse tree, like this:

[1, '+', [2, '+', [3, '+', 4]]]

given this input:

1 + 2 + 3 + 4

Either way, whatever is done, it's the job of the semantic analysis phase to do so. In our example in this article, we produce a sum of the collected integer series. So, our parser would return: 10 for the example input given above.
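
To make the two phases concrete, here is a hand-rolled sketch in plain VimL. It is not VimPEG and not how VimPEG works internally; SumSeries() is a hypothetical function of mine that uses one regex for the syntax recognition and a loop for the semantic analysis:

```vim
" Hypothetical hand-rolled recogniser and evaluator (not VimPEG).
function! SumSeries(input)
  " Syntax recognition: integers separated by '+', whitespace optional.
  if a:input !~# '^\s*\d\+\%(\s*+\s*\d\+\)*\s*$'
    throw 'SumSeries: not a sum series of integers: ' . a:input
  endif
  " Semantic analysis: add up the collected integers.
  let total = 0
  for token in split(a:input, '\D\+')
    let total += str2nr(token)
  endfor
  return total
endfunction

echo SumSeries('1 + 2 + 3 + 4')
echo SumSeries('123')
```

These should echo 10 and 123. Inputs such as '1 + + 2' or '1 2' are rejected by the regex and make the function throw. It works for this toy grammar, but the regex gets ugly fast as the language grows, which is rather the point of wanting a parser generator.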

What is a 'Parser Generator'?

Writing a parser is not easy. Well, it's not simple. It's fussy. It's messy. There's a lot of repetition and many edge cases and minutiae that bore a good coder to tears. Sure, writing your first recursive descent parser is better than sex, but writing your second one isn't. Writing many is as much fun as abstinence. Enough said.

So, we (as fun loving coders) want a better alternative. Parser generators provide that alternative. They generate parsers; which means they do all the boring, tedious, repetitive hard-labour and clerical book-keeping stuff for us. I hope I've painted that with just the right amount of negative emotion to convince you on a subliminal level that Parser Generators are a Good Thing(TM).

How do they generate a parser? or What's a 'PEG'?

Parser Generators are told what to expect (what is valid or invalid) through a grammar -- a set of rules describing the allowed constructs in the language it's reading. Defining these rules in a declarative form is much easier, quicker and less error-prone than hand-coding the equivalent parser.

Bryan Ford recently (circa 2004) described a better way to declare these rules in the form of what he called Parsing Expression Grammars -- PEGs.

NOTE: We used to declare these parsing rules in EBNF, intended for a recursive descent parser (or an LL or LALR or other parser). And before you drown me in comments of "They so still use that, dude!" -- I know. They do.

In a nutshell, PEGs describe what is expected in the input, rather than the (E)BNF approach of describing what is possible. The difference is subtle but liberating. We'll not go too much into that now -- except to say: PEGs offer a cleaner way to describe languages that computers are expected to parse. If you want to re-program your 13 year old brother, you might not reach for a PEG parser generator, but as we're dabbling here in the confines of computers and the valley of vim, PEGs will do just fine.

A major benefit to PEG parsers is that there is no separate lexical analysis phase necessary. Because PEG parsers 'expect' to see the input in a certain way, they can ask for it in those expected chunks. If it matches, great, move on. If it doesn't match, try another alternative. If all the alternatives fail, then the input doesn't match. Allow for backtracking, and you have all you need to parse 'expected' input.
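
The "try the alternatives in order, first match wins" idea can be sketched in a few lines of VimL. This is only an illustration of ordered choice over regexes; OrderedChoice() and the shape of its result are my own inventions for the sketch and are not VimPEG's API:

```vim
" Hypothetical ordered-choice helper (not part of VimPEG).
" Tries each regex at byte position pos; the first one that matches wins.
function! OrderedChoice(input, pos, alternatives)
  for pattern in a:alternatives
    " With a start position, ^ anchors the match at that position.
    let stop = matchend(a:input, '^\%(' . pattern . '\)', a:pos)
    if stop != -1
      let text = strpart(a:input, a:pos, stop - a:pos)
      return {'matched': 1, 'text': text, 'pos': stop}
    endif
  endfor
  " Every alternative failed: report no match and leave pos untouched.
  return {'matched': 0, 'text': '', 'pos': a:pos}
endfunction

let r = OrderedChoice('12+3', 0, ['\d\+', '+'])
echo r.text r.pos
let r = OrderedChoice('12+3', r.pos, ['\d\+', '+'])
echo r.text r.pos
```

The first call should consume 12 and move to position 2; the second fails on the integer alternative, falls through to '+' and moves to position 3. A caller that gets matched == 0 back simply carries on from the old position, which is all backtracking amounts to here.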

NOTE: VimPEG is not a memoising (packrat) parser -- not yet, anyway.

A brief overview of the PEG parsing rule syntax

Terminal symbols are concrete and represent actual strings (or in the case of VimPEG, Vim regular expressions) to be matched.
Non-terminal symbols are names referring to combinations of other terminal and/or non-terminal symbols.
Each rule is of the form: A ::= e -> #s

A is a non-terminal symbol
e is a parsing expression
s (optional) is a semantic transformation (data-munging callback)

Each parsing expression is either: a terminal symbol, a non terminal symbol or the empty string.
Given the parsing expressions, e1 and e2, a new parsing expression can be constructed using the following operators:

Sequence: e1 e2
Ordered choice: e1 / e2
Zero-or-more: e*
One-or-more: e+
Optional: e?
And-predicate: &e
Not-predicate: !e

A Conceptual Model of VimPEG

There are three players in the VimPEG game:

The VimPEG Parser Generator (Vim plugin)
The Language Provider
The Client

The VimPEG Parser Generator

This is a Vim plugin you'll need to install to both create and use VimPEG based parsers.

The Language Provider

This is someone who creates a parser for a new or existing language or data-structure. They create the grammar, data-munging callbacks, utility functions and a public interface into their 'parser'.

The Client

This is someone who wants to 'use' a parser to get some real work done. Clients can either be Vim end-users or other VimL coders using a parser as a support layer for even more awesome and complicated higher-level purposes.

There are five pieces to VimPEG

The VimPEG library (plugin)
A PEG Grammar (provider-side)
Callbacks and utility functions [optional] (provider-side)
A public interface (provider-side)
Client code that calls the provider's public interface. (client-side)

Our Parsing Example

Let's return to our parsing example: recognising (and eventually evaluating) a sum series of integers.

Examples of our expected Input

123
1 + 2 + 3
12 + 34 + 56 + 78

A traditional CFG style PEG for a Sum Series of Integers:

  Expression  ::= Sum | Integer
  Sum         ::= Integer '+' Expression
  Integer     ::= '\d\+'

In the above PEG for matching a Sum Series of Integers, we have:

Three non-terminal symbols: 'Integer', 'Sum' and 'Expression'
Two terminal symbols: \d\+ and '+'
One use of Sequence with the three pieces: 'Integer' '+' 'Expression'
One use of Ordered choice: 'Sum' | 'Integer'

NOTE: The original (and actual) PEG formalism specifies the fundamental expression type as a simple string. VimPEG shuns (at probable cost) this restriction and allows regular expressions as the fundamental expression type. Original PEG grammars use / to indicate choice, but VimPEG uses | instead.

Anyone familiar with CFG grammar specifications will feel right at home with that example PEG grammar above. Unfortunately, it isn't idiomatic PEG. The thing to be parsed here is a list. PEGs have a compact idiomatic way of expressing that structure:

  Expression   ::= Integer (('+' | '-') Integer)*
  Integer      ::= '\d\+'

Here the arguably simpler concept of iteration replaces the CFG use of recursion to describe the desired list syntax to be parsed. It's so much simpler that it seemed a waste not to bundle subtraction in with the deal. Now our parser can evaluate a series of integer add and subtract operations.
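To make the iteration-versus-recursion point concrete, here is a minimal hand-written sketch of the same rule in Python. It is not VimPEG and makes no claim about how VimPEG works internally; the name evaluate_series and the three token regexes are assumptions made purely for illustration, and skipping whitespace between tokens is likewise an assumption.

```python
import re

INTEGER = re.compile(r"\d+")    # Integer ::= '\d\+'
OPERATOR = re.compile(r"[+-]")  # ('+' | '-')
SPACES = re.compile(r"\s*")     # whitespace skipped between tokens


def evaluate_series(text):
    """Evaluate Expression ::= Integer (('+' | '-') Integer)* over text."""
    pos = SPACES.match(text, 0).end()

    first = INTEGER.match(text, pos)
    if first is None:
        raise ValueError(f"expected an integer at byte {pos}")
    total = int(first.group())
    pos = SPACES.match(text, first.end()).end()

    # The (...)* part: a plain loop stands in for the CFG's recursion.
    while True:
        op = OPERATOR.match(text, pos)
        if op is None:
            break
        after_op = SPACES.match(text, op.end()).end()
        number = INTEGER.match(text, after_op)
        if number is None:
            break  # the pair failed as a whole, so pos stays before the operator
        value = int(number.group())
        total = total + value if op.group() == "+" else total - value
        pos = SPACES.match(text, number.end()).end()

    # Everything must have been consumed, like anchoring on end of line.
    if pos != len(text):
        raise ValueError(f"unexpected input at byte {pos}")
    return total
```

Calling evaluate_series('12 + 34 + 56 + 78') walks the loop three times; an input such as '1 - a' abandons the half-matched pair and raises a ValueError instead.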

The VimPEG API

  peg.e(expression, options)                  "(Expression)
  peg.and(sequence, options)                  "(Sequence)
  peg.or(choices, options)                    "(Ordered Choice)
  peg.maybe_many(expression, options)         "(Zero or More)
  peg.many(expression, options)               "(One or More)
  peg.maybe_one(expression, options)          "(Optional)
  peg.has(expression, options)                "(And Predicate)
  peg.not_has(expression, options)            "(Not Predicate)

Defining the Series of Integer Add and Subtract Operations PEG

  let p = vimpeg#parser({'skip_white': 1})
  call p.e('\d\+', {'id': 'integer'})
  let expression =
        \ p.and(
        \   [ 'integer',
        \     p.maybe_many(
        \       p.and(
        \         [ p.or(
        \           [ p.e('+'),
        \             p.e('-')]),
        \           'integer'])),
        \     p.e('$')],
        \   {'on_match': 'Expression'})

This example demonstrates several aspects of VimPEG's API:

Elements that have been 'identified' (using the 'id' attribute) can be referred to in other expressions. 'integer' is identified in this case and referenced from 'expression'.
Only root-level elements need to be assigned to a Vim variable. In this case, the 'expression' element is considered to be a root element -- we can directly call on that element now to parse a series of integer add and subtract operations.
Intermediate processing (for evaluations, reductions, lookups, whatever) is achieved through callback functions identified by the 'on_match' attribute. The 'Expression' rule uses such a callback to iterate the list of add or subtract operations to evaluate their final total value. Here is that callback function:

  function! Expression(args)
    " initialise val with the first integer in the series
    let val = remove(a:args, 0)
  
    " remaining element of a:args is a list of [ [<+|->, <int>], ... ] pairs
    let args = a:args[0]
    while len(args) > 0
      let pair = remove(args, 0)
      " ==# compares case-sensitively regardless of the 'ignorecase' setting
      let val = (pair[0] ==# '+') ? (val + pair[1]) : (val - pair[1])
    endwhile
    return val
  endfunction

The public API interface

  function! EvaluateExpression(str)
    let res = g:expression.match(a:str)
    if res.is_matched
      return res.value
    else
      return res.errmsg
    endif
  endfunction

The res object holds a lot of information about what was actually parsed (and an errmsg if parsing failed). The value element will contain the cumulative result of all the 'on_match' callbacks as the input was being parsed.

Using it

  echo EvaluateExpression('123')
  echo EvaluateExpression('1 + 2')
  echo EvaluateExpression('1 + 2 + 3')
  echo EvaluateExpression('4 - 5 + 6')
  echo EvaluateExpression('1 - a')

NOTE: The last example there will return the error message: 'Failed to match Sequence at byte 2'. This might seem unexpected -- we might have been hoping for something more meaningful about not expecting an alphabetic character when looking for an integer digit. It's telling us that (after gracefully falling back out of the optional series of add and subtract operations) it can't match '$' (end of line) at byte 2 because a '-' character is in the way.

Not terribly exciting, granted, but hopefully this serves as a reasonable introduction to the VimPEG Parser Generator. What can you do with it? I look forward to seeing weird and wonderful creations and possibilities in Vim now that real parsing tasks are more accessible.

What's Next?

As beautiful (ok, maybe not, but I've seen more hideous interfaces) as VimPEG's API is, she could do with a touch of lipstick. Instead of calling the API directly, it would be nice to be able to declare the rules using the PEG formalism. That's exactly what Raimondi has done in one of his contributions to VimPEG and that's what we'll be talking about in the next article.

In a future article I will show an example of sugar-coating the VimL language to make function declarations both a little easier on the eyes and fingers as well as adding two long-missing features from VimL -- default values in function parameters and inline function declarations, a la if <condition> | something | endif.

Tuesday, June 5, 2012

Now You Know When To Macro

Use macros to reformat multi-line text
that would be too finicky
with :ex's linewise commands.

What Are Macros?

Macros are just recorded keystrokes of what the user would normally do manually when faced with the given editing task. As such, they do not represent much of a conceptual challenge and therefore the barrier to adoption is quite low. Once users gingerly try their first few macros, they're hooked and look for new and interesting screws to bang with their dull new hammer.

Regular expressions, on the other hand, are a mess of complicated hieroglyphics to the layman. Like nothing they've used or done before, they appear opaque and unfathomable. The meanings of various atoms within a regular expression are overloaded (they take on different meanings within different contextual sections of the expression), which only adds to the surprise and frustration of those who make the effort to learn them. Let's not even mention Vim's silly variable magic-ness.

For a great many editing situations, though, regular expressions are the sharpest tool in a Vimmer's toolbox. They slice and dice text cleanly and, when you master them, efficiently. I urge every Vimmer to vault the regex wall; glory awaits you on the other side.

For the right task, however, macros are the right tool - a better tool than regular expressions (alone). But... what is the right task? In my opinion, macros should be used for multi-line edits where Vim's :ex commands (the regular-expression-driven search & replace command, :substitute, among them) prove fragile and stubborn.

A quick duck around the net revealed several good explanations and examples of macros being used correctly. It also produced a few cases of macros being used where the better choice would have been one or several regexes.

I won't bore you with yet another contrived example here; I'll simply point to some good ones:

Actually... As I was readying to publish this, I just noticed a macro that I always use. Moving to the first paragraph, I type:

qavipJ}jq

I then run it on the second paragraph with @a and the subsequent ones with @@. I don't make it crawl over the whole file automatically because the asciidoc multi-line headings, lists and admonition blocks get messed up with this macro. Given the length of my articles and the builtin } command to move to the next paragraph, it's not a big deal.

Recursive Macros

One neat trick about macros is that they can be recursive. This is often used to make the macro auto repeat without having to specify a range or count of lines when executing it. While it is fairly trivial to specify a range of lines for a macro to execute over, I'd suspect that if it _were_ that easy then using a macro in the first place was probably the wrong choice. With a macro that utilises a find up front to set the position of the subsequent edits and then calls itself as the last step, you have a simple and neat solution that finds and changes all occurrences for you.

Here is a simple example of a recursive macro:
http://dailyvim.blogspot.com/2009/06/recursive-macros.html

And I've discussed recursive macros before too.

Cases Where Macros Were The Wrong Tool

Of the many poor examples of macros, I'll highlight why I consider macros to be a poor choice with the following two cases (chosen randomly):

http://www.joeldare.com/wiki/linux:vim_macro

The source data is all on separate, unrelated lines. This is a clear signal that plain :ex (in this case, a regex :substitute) should be employed instead of a macro.

Original source lines:

First
Last
Phone

Desired destination lines:

<label>First:</label><input type="text" name="First">
<label>Last:</label><input type="text" name="Last">
<label>Phone:</label><input type="text" name="Phone">

Simple regex solution:
:%s/.*/<label>&:<\/label><input type="text" name="&">
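For readers who know regexes from outside Vim, the same edit can be sketched in Python. This only illustrates what & (the whole match) is doing in the replacement; the function name label_lines is hypothetical, and the :substitute above remains the thing to actually type in Vim.

```python
import re


def label_lines(lines):
    """Wrap each line in a label/input pair, like :%s/.*/...&.../ does."""
    # \g<0> is Python's spelling of Vim's & (the entire matched text).
    template = r'<label>\g<0>:</label><input type="text" name="\g<0>">'
    return [re.sub(r"^.*$", template, line) for line in lines]


for html in label_lines(["First", "Last", "Phone"]):
    print(html)
```

One pattern and one replacement handle every line independently, which is exactly why a macro is unnecessary here.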

http://www.oreillynet.com/mac/blog/2006/07/more_vim_save_time_with_macros_1.html

Again, the source data is strictly single-line, so a simple :substitute is all that is required.

:%s/^\d\+ -- \(.*\)\( (\d\{4})\)$/<li><em>\1<\/em>\2<\/li>/

Follow up with some normal commands to wrap the list:
o</ol><esc>
<c-o>O<ol><esc>

I've tried to show in this article the following things:

Vim's macros are cool when used in the right place - where the source involves complex, interdependent multi-line text.

Don't use a macro when the source does not consist of multi-line, interdependent text. Use an :ex command, like :substitute then.

LEARN regular expressions. They'll save you a LOT of time in the long haul.

The Art of Edits I - Weaponry

When faced with vicious hordes of marauding tweaks, ghastly repeats and soul shaking changes to the landscape that would make a lesser mortal whine and pule under their desks, Vimmers do not wince or cringe or shy away from their keyboards. This is their domain. This is what they've been practising for. The accomplished warrior does not foolhardily rush into battle without first examining his lot carefully. What is the nature of the beast? What weapon shall I vanquish it with? How many scalps will I collect before my anime starts?

The Armoury

Various weapons adorn the armoury of the Valiant Vimmer:

Edits

A small hand-held weapon, most effective in cramped and dynamic situations. CAUTION: Sole reliance on this weapon will cause fatigue. Remember to sheathe your Insert Edits when you're not using them. These are the normal, insert and visual mode commands that constitute a lot of ordinary editing work. Read the left column of topics in the table of contents of :help quickref for a summary of awesome tips, tricks and techniques this weapon offers.

Abbreviations

Grips and tactics to wield Edits more gracefully. These are rehearsed and prepared shortenings that automatically expand out to longer forms while we're typing. They're available in both insert and command-line modes. Read :help :abbreviate for the craft and maintenance of this weapon.

Completions

A mystical amulet that bestows upon the wearer the ability to complete an attack automatically. Adept users of this power can call on recent battlefield attacks, those from historic battles, from battlefield manuals or even from an omniscient guiding hand. Read :help ins-completion for access to this gem of valour.

Regexes

The most devastating hand-held weapon in the Vimmer's arsenal. Due to its ungainly size and the need to retreat from battle while preparing an offensive with this weapon, it is not recommended for close quarter or highly dynamic combat. Often best suited for ranged attacks. The adept Vimmer is an effective user of Vim's regular expressions. To sharpen your regex skills, read :help pattern.txt and you can practise with VimRegexTutor.

Ex

One of a warrior's greatest strengths is the ability to command his troops amidst the frenzy of battle. Ex is the powerful tool Vimmers use to coordinate and execute larger strategies on the battlefield. Raining fire down upon your enemies from afar, changing the very landscape beneath their feet, and even travelling in time are but a few of the awesome powers at the command of a Vimmer versed in Ex. These are all of the : (colon) commands in Vim. These commands are inherently linewise and can therefore be ungainly on multiline monsters (macros are the recommended weapon when facing multi-lined monstrosities). Read :help ex-cmd-index for an overview of battlefield orders you can issue.

Maps

The map is a ranged weapon providing a sortie of commands with the flick of a wrist. Skilled Vimmers spend hours in training creating map weapons for special purpose attacks. The builtin Edits occupy a vast majority of the available body (keyboard) space. Careless novices unwittingly replace valuable original Edits with inferior maps, only to be defeated or fatigued in extreme battle. The master Vimmer chooses his maps carefully, regularly preferring to place them on the purpose-built shield (:help mapleader) so as not to interfere with his perfected mastery of the native Edits. Read :help map.txt to start crafting your own maps of devastation.

Macros

Macros are pre-rehearsed battle maneuvers that can be performed instantly with precise execution. A powerful weapon but dangerous when employed on the wrong terrain. The macro will blindly repeat the instructions it was established with. Regardless of whether this weapon is fired when facing the wrong enemy or even friendly troops, it will charge headlong into battle until its target is demolished or it has been overcome (an error occurs). Here are several good explanations of macros and their uses.

VimL

The arcane language of the truly wizened Vimmer, capable of wreaking havoc on even the most worthy opponent. Once a spell is cast its effects are almost instantaneous, but the incantation process is long and complicated. This discipline is better practised at quieter times of relative peace whereupon its power may be distilled into volatile potions, sacred scrolls, charged wands and ready tomes that can be more freely and immediately used on the battlefield. Such artefacts in the land of Vim are called plugins. To begin on your path as a Vim Mage, read Damian Conway's excellent introduction to Vimscript. The accomplished Vimlglot might prefer a deeper analysis. The holy arcanum is kept in :help usr_41.txt.

Veteran Vimmers and Alacritous Acolytes alike are well advised to review their personal arsenal frequently, polishing tarnished weapons and sharpening dull and forgotten ones. Develop a regimen of daily practice in each of these tools and skills so that you may face your next editing evil with brave heart, right mind and quick hand.

Being virtuous in preparation and valiant in battle,
the nature of a Vimmer is irrepressible!

Monday, June 4, 2012

Advanced Macros in Vim, Commentary

This is a bloated comment on: http://blog.sanctum.geek.nz/advanced-vim-macros/

Sharks in Shells

Firstly, there are at least two plugins that can tabulate text in vim: Align and Tabular. Using plugins written in pure vimscript is usually better than shelling out to system tools, because it's more OS independent. On that same note, Vim has a builtin :sort command, so no need to shell out for that either.

You warn against reaching for the hammer when holding screws; that it's sometimes better to use a splash of :ex and a sprinkle of VimL. Indeed, that is true in this example (starting with the cursor on the first line of the original data table):

:Tabularize / \+/
:sort
:%s/\d\{4}/\=strftime('%Y')-submatch(0)/
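The \= in that last command makes Vim evaluate an expression for each match and use the result as the replacement. As a rough analogy only, here is the same idea in Python, where a function takes the place of the expression. The name years_to_ages, the current_year parameter and the sample line are all made up for illustration; they are not taken from the original article's data.

```python
import re


def years_to_ages(line, current_year):
    """Replace the first four-digit year on a line with an age."""
    # The lambda runs once per match, like Vim's \= expression;
    # m.group() corresponds to submatch(0).
    return re.sub(
        r"\d{4}",
        lambda m: str(current_year - int(m.group())),
        line,
        count=1,  # :s without the g flag also stops at the first match
    )


print(years_to_ages("Alice  1980", 2012))
```

Passing the year in as a parameter keeps the sketch repeatable; the Vim command asks strftime('%Y') for it instead.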

Mentioning Macro Mechanics

Yanking a macro line from your buffer into a register with "ayy captures the trailing ^J which, if you have something bound to <Enter>, will produce unwanted side-effects. Use ^"ay$ instead (or the VimLocalMacros plugin).

When assigning macros in a let expression, the double-quote form does allow for Vim map-style markup:

let @a = 'must have ^M literal chars'
let @a = "can have \<cr> escaped markup"

There's no inherent benefit from moving this sort of macro into a function - it's so uniquely dependent on your current text file (and year!) that it simply doesn't make sense in this case. However, this trick might be useful in more generic situations. Using a function to preserve macro contents allows us to reuse the register that macro was occupying. Personally, I use macros on an ad hoc basis, crafting them quickly for the need at hand and discarding them when done. I might overwrite register @a several times over within a single edit session with various macros on an as needed basis. However, I was asked one day how to persist particular macros for a given file. The embedded-in-a-function approach shown in the article is one solution. I also wrote VimLocalMacros for this purpose.

Macros stop executing if an error occurs. You can use this to your benefit. Design your macros so they deliberately fail once the work is done; that way you don't have to count, you just provide a large-enough number of runs: 999@a

Recursive macros have their uses and should not be dismissed as the wrong tool in favour of reaching for a scripting language.

Sunday, May 13, 2012

My Vimrc

bairui, y u no?

I've been asked before why my ~/.vimrc is not available online.

I guess, deep down, there are probably feelings of apprehension about releasing something so personal and close to one's heart; that insensitive malcontents might throw stones through the windows and graffiti the walls.

A bigger and more real concern is that of the endemic affliction known as Cargo Culting. Vim is like a martial art. Mastery comes from years of rigorous, dedicated practice, lots of trials and accidents and occasional brilliant successes, study at the feet of other masters and plain old Time at the coal-face. There is no short-cutting this process. You can speed it along with the right approach, but you can not just dump a master's vimrc file at $HOME and think you're playing with the big boys now. That will only lead to tear-stained keyboards and hosed code.

Heh... I'm distorting the truth only slightly here, but I was just told a funny story that allegorises this tragedy: A good friend and fellow vimmer had his Vim set up with :set colorcolumn=80 to remind him where to constrain his code lines. A colleague passing by gasped, "Oh no! Your monitor is broken!". Upon explaining that everything was ok, that the monitor was indeed working fine and that Ti... er, I mean, my friend wanted the line there, the enquirer left mollified and quietly envious of this awesome feature that his Ec, er, I mean, editor didn't have. A week later, my friend discovered his colleague's computer had a dark red line too; only, his was drawn on a transparent plastic strip sticky-taped to the monitor.

:-)

In fairness... the colleague was mocking my friend, so this is not really a story of Cargo Culting, but it shows the phenomenon well, I think. Someone sees something awesome in another person's setup and blindly copies it in the vain hopes it will bring them equal fortune too.

While not a Vim Master, I do have the occasional sharp implement that is best kept away from curious beginners. To that end, I am not just pushing my vimrc up for you to copy blindly. I give you here a snapshot in time. I have annotated it in the hope that it shows you why I chose a certain option or implemented a particular solution. It is my intention that you learn how to wield this cutting edge tool with ample guidance and clear models.

So much for the What and Why. Now a little bit about the How. An expert is not born expert. Years of training go into the growth of his skills, techniques, tools and knowledge. My vimrc has grown over the years in the same way, from humble beginnings through proud days of development and embarrassing times of foolishness all the while with unremitting hubris.

You can see examples of that growth in the snapshot below. In various places I note pieces that might eventually grow to the point of becoming their own plugin. When they do, they will be cut out of my vimrc and put into a standalone plugin of their own. Doing this helps keep the core vimrc down to a manageable size. It also allows me to share it more easily with others.

Finally, the process of cleaning up my vimrc for public release was quite a long and intellectually grueling one. It generated the philosophical challenge: what is my Vim configuration? By this I mean, it occurred to me that my vimrc was not the entirety of my Vim configuration. I have extracted and encapsulated within plugins many times over productive growths from my vimrc throughout the years. They serve me well still, sitting mostly quietly at the sides, doing their jobs unnoticeably well. To forget these pieces and not include them in my configuration would be disingenuous. They provide many of the commands that my muscle memory relies on within Vim today. They are my Vim configuration; certainly as much as my vimrc can claim to be, anyway.