After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 774299 - [m4-common] Use POSIX regex word boundaries rather than GNU
[m4-common] Use POSIX regex word boundaries rather than GNU
Status: RESOLVED FIXED
Product: jhbuild
Classification: Infrastructure
Component: general
unspecified
Other Mac OS
: Normal normal
: ---
Assigned To: Jhbuild maintainers
Jhbuild QA
Depends on:
Blocks: 774453
 
 
Reported: 2016-11-11 21:41 UTC by Philip Chimento
Modified: 2016-11-15 06:33 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
build: Use POSIX regex word boundaries (1.04 KB, patch)
2016-11-11 21:41 UTC, Philip Chimento
none Details | Review
build: Check sed's word boundary syntax (2.00 KB, patch)
2016-11-13 04:31 UTC, Philip Chimento
none Details | Review
build: Remove dnl lines with m4 itself (1.09 KB, patch)
2016-11-13 23:44 UTC, Philip Chimento
committed Details | Review

Description Philip Chimento 2016-11-11 21:41:03 UTC
Not sure if this is the right product to file bugs with m4-common... Here's a patch required for m4-common on macOS.
Comment 1 Philip Chimento 2016-11-11 21:41:22 UTC
Created attachment 339670 [details] [review]
build: Use POSIX regex word boundaries

This uses POSIX regex word boundaries [[:<:]] and [[:>:]] instead of
GNU-specific \< and \>.
Comment 2 Michael Catanzaro 2016-11-11 22:34:23 UTC
There might not be an appropriate Bugzilla product for m4-common. This is probably fine, let's just CC Allison and Phillip Withnall.
Comment 3 Philip Withnall 2016-11-13 00:00:52 UTC
Review of attachment 339670 [details] [review]:

If I run that on my Fedora machine, I get:

sed: -e expression #1, char 23: Invalid character class name

For anyone wanting to find the man page for these bracket expressions, it's here: https://developer.apple.com/legacy/library/documentation/Darwin/Reference/ManPages/man7/re_format.7.html
Comment 4 Philip Chimento 2016-11-13 01:15:04 UTC
Hm, looks like \< would be OK on BSD sed if it used the REG_ENHANCED flag, sadly BSD sed doesn't.

I'm surprised GNU sed (not even sed --posix) doesn't have [[:<:]] since it is claimed that that notation is POSIX.

I guess I'll write a configure check for which one to use.
Comment 5 Philip Chimento 2016-11-13 04:31:09 UTC
Created attachment 339735 [details] [review]
build: Check sed's word boundary syntax

Unfortunately GNU sed and BSD sed seem to have no overlapping ways of
indicating word boundaries. This adds a configure check to determine
which is the correct syntax.
Comment 6 Philip Chimento 2016-11-13 04:32:37 UTC
This turned out pretty ridiculous - we could also just delete /\sdnl\s.*/ and be mostly correct...?
Comment 7 Philip Withnall 2016-11-13 13:06:06 UTC
(In reply to Philip Chimento from comment #6)
> This turned out pretty ridiculous - we could also just delete /\sdnl\s.*/
> and be mostly correct...?

Yeah, that is pretty ridiculous. Unfortunately, I think it is necessary to remove dnl lines (and partial lines). Since the problem seems to be the special whitespace character classes which also match the start or end of the line, we could split it into two stages: one which removes ‘dnl’s which start part-way through a line (and therefore have whitespace beforehand); and one which removes ‘dnl’s which start at the start of the line (and therefore don’t).

Something like:
   … | sed -e "s/[[:space:]]dnl[[:space:]].*//" | grep -v "^dnl.*" | …
maybe?

[[:space:]] is in POSIX, and seems to be mentioned in the Apple man page. It works on Fedora, but I am unable to test it on a Mac.
Comment 8 Philip Chimento 2016-11-13 23:42:38 UTC
Come to think of it, that wouldn't work either:

    m4_something(argument)dnl comment flush against parenthesis

We can just pipe the whole thing through m4 itself though, that will remove any dnls exactly like autoconf would by definition, but won't expand any of the autoconf stuff. I examined the output "used" and "defined" files and it seems correct.
Comment 9 Philip Chimento 2016-11-13 23:44:19 UTC
Created attachment 339774 [details] [review]
build: Remove dnl lines with m4 itself

Instead of a non-optimally-portable regex, use m4 itself to remove dnl
lines. Plain m4 without any autoconf extensions will remove dnl, but not
expand any of the autoconf stuff.
Comment 10 Philip Withnall 2016-11-14 00:18:16 UTC
Review of attachment 339774 [details] [review]:

If you’ve checked the before/after output carefully, I guess we can be confident about it. I don’t know enough about how raw `m4` works to know if there are any potential side-effects we need to care about.

Looks OK to commit with this nitpick fixed.

::: Makefile.am
@@ +29,2 @@
 all-local: $(dist_aclocal_DATA)
+	cat $^ | grep -v '^#' | grep -v 'AX_PACKAGE_REQUIRES' | m4 | grep -o '\<AX_[A-Z0-9_]*[A-Z0-9]\>' | sort | uniq > used

It would probably be an idea to add a brief comment above explaining that m4 is being used to strip comments.
Comment 11 Philip Chimento 2016-11-14 06:01:43 UTC
Attachment 339774 [details] pushed as e6c4817 - build: Remove dnl lines with m4 itself
Comment 12 Philip Chimento 2016-11-14 06:04:48 UTC
OK, comment added, and I elaborated a bit more in the commit message as well.

I could see one potential side effect but it's unlikely: Someone could define an M4 macro in one of the autoconf-archive files that has the same name as one of the plain-M4 builtins that autoconf undefines. That file would then probably cause M4 to error out. However, since no M4 builtins begin with "AX_", it's quite unlikely that there would be any collisions.

Thanks for the reviews.