OpenBSD Mail Box: vi: count occurrences of a substring

On 04/09/2021, Marc Chantreux <mc@unistra.fr> wrote:
> hello,
>
>> :!sed s/abc/abc\n/g % | grep -c abc
>
> Note: in sed, "what I just matched" is written &

Oh, that's good, thank you.

*Shoulda seenit on the man page -- butta dinnt.*

From sed(1):
> An ampersand ('&') appearing in the replacement is replaced by the string matching the regular expression.
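A quick way to see the ampersand at work. This is only a sketch: the sample string and the function name wrap_matches are made up for illustration, not taken from the thread.

```shell
# '&' in the replacement stands for whatever the regex matched,
# so each "abc" gets wrapped in brackets without retyping it.
wrap_matches() {
    printf '%s\n' 'xxabcyyabc' | sed 's/abc/[&]/g'
}
wrap_matches
```

Every match is put back in place, with the brackets around it.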

>> Googled information suggests that the opposite of what's described in
>> the man page may be true: You CAN use a literal newline, but you
>> can't use \n.
>
> BSD sed is more literal AFAIK, so you need to escape a real newline
> (0x0a), but both GNU and BSD support escaped newlines:
>
> sed 's/abc/&\
> /g'

So is this incorrect? <https://man.openbsd.org/sed#SED_REGULAR_EXPRESSIONS>:

> The escape sequence \n matches a newline character embedded in the pattern space. You can't, however, use a literal newline character in an address or in the substitute command.
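The backslash-newline form quoted above is easy to check at a shell prompt. A minimal sketch, assuming a POSIX-ish sh; the function name count_abc and the sample input are mine:

```shell
# Put each match at the end of its own line (backslash followed by a
# real newline in the replacement), then count the lines that match.
count_abc() {
    sed 's/abc/&\
/g' | grep -c abc
}
printf '%s\n' 'abcabc' 'xabcx' 'none' | count_abc
```

The first input line holds two occurrences and the second holds one, so grep -c sees three matching lines after the split.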

> This doesn't help in vi so you can fake it for a moment using tr:
>
> sed 's/abc/&œ/g' | tr œ '\n'

Like I mentioned elsewhere, I'm *really* not a fan of the "let's
assume the input won't contain THIS" approach, and like I also
mentioned in another mail, I would prolly write a script absent
superior alternatives, so yeah, I agree:

> Another solution is to write commands for this kind of task:
>
> <<\. cat > ~/x
> #! /bin/ksh
>
> sed -r 's/a/&\
> /g'
> .

Wait, hold up, I'm not familiar with this input redirection idiom.
Could you explain? Why the double <, and why does it not work with a single <?
Also, could you explain the escaped period?[0] This is very hard to google.
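
For reference, this looks like the shell's here-document syntax: "<<" followed by a delimiter word feeds the following lines to the command's stdin, up to a line holding only the delimiter (here a single period). A single "<" would instead try to open a file with that name. Quoting any part of the delimiter, as the backslash does, turns off expansion in the body. A small sketch; the function names are made up for illustration:

```shell
# Quoted delimiter (\.): the body is passed through untouched.
show_literal() {
<<\. cat
HOME is $HOME
.
}

# Unquoted delimiter (.): $HOME is expanded before cat sees it.
show_expanded() {
<<. cat
HOME is $HOME
.
}

show_literal
show_expanded
```

The first call prints the text "$HOME" as typed, the second prints the actual home directory. In the quoted script nothing needs protecting except the backslash at the end of the sed line, which an unquoted here-document would treat as a line continuation.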

> then from vi, :w !~/x

While we're at this, we should probably try and complete the script
(which also needs the chmod +x treatment), so this will make more
sense to all.
Like so:

#!/bin/sh
# usage: count REGEX <FILE
sed -E 's/'"$1"'/&\
/g' | grep -Ec -e "$1"

Then, from the shell:

$ <FILE count abc

Or from inside (n)vi:
(I named my script count and put it in ~/bin/, which is in my PATH.)

:!<% count abc

Or ":w !count abc", which is arguably better: ":w !" already pipes the buffer to the command, so it counts unsaved changes too and needs no "<%".
But there's no shame in ":!cat % | count abc" or
":!cat FILE | count abc" either. The best invocation is the one you
remember. *Now KISS.*
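
To try the idea without creating a file first, the same pipeline can live in a shell function. This is a sketch, not a finished tool: the name count_occ and the sample input are hypothetical, and $1 is quoted so that a pattern containing spaces survives word splitting.

```shell
# count_occ REGEX: count non-overlapping matches of REGEX on stdin.
# Hypothetical name; same sed | grep idea as the count script.
count_occ() {
    # -E on both sides so sed and grep read the pattern the same way;
    # -e keeps a pattern that starts with "-" from looking like an option.
    sed -E 's/'"$1"'/&\
/g' | grep -Ec -e "$1"
}

printf '%s\n' 'abc abcabc' 'no match here' 'xabc' | count_occ abc
printf '%s\n' 'a b a b' | count_occ 'a b'
```

A pattern containing a slash would still break the sed expression, so this is only good for simple cases.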

An idea might be to use $@ instead of $1 in the script, but I haven't
really thought through the implications, and I'm not sure how to
reliably quote that after grep -c. If anyone wants to opine on that,
shoot; but I'll prolly leave this for now. $1 feels safer and more
KISS-compliant as well.

>> literal carriage return, not a literal newline (^J). That's the case
>> on Linux as well, and I don't know why.
>
> neither do i.
>
>> Your new subject line is slightly imprecise, as words are usually
>> whitespace-delimited, and I was "looking for a way to count
>> occurrences of 'abc' in FILE". Not every substring is a word.
>
> right ... wasn't thinking that much about the name. sorry :)

Not to worry, and thank you so much!
Ian

> regards
> marc
>

fn:
[0] as the actress said to the ST ven... never mind

Saturday, September 04, 2021
