Page MenuHomePhorge

Fix exception and error message rendering some markup matching figlet/cowsay regex
AbandonedPublic

Authored by aklapper on May 20 2023, 15:31.

Details

Summary

The remarkup syntax to generate a figlet and cowsay is basically one of:

  • interpreter {{{text}}}
  • interpreter (options) {{{text}}}

However the corresponding regex matches on any preceding string for interpreter, such as a {{{a}}} a.

This very generic parser was designed to render a "No interpreter found" error message when your "interpreter"
was not already registered. But, in some cases it was also causing a:

RuntimeException: Undefined array key

Since it's probably not nice to explode in these cases, and since it's not nice to replace the user input with a
potentially confusing error message if you want just say foo {{{bar}}}, we just try to be even more nice:
we preserve the user input as-is, we still allow full flexibility to add custom extensions with weird parsers, but,
we log potential problems, to still make it known that something is probably not OK.

Your weird comments are safe now.

Long life to cowsay and figlet, BTW.

Proof of concept:

Phorge Cowsay Figlet showcase.png (891×837 px, 66 KB)

Closes T15372

Test Plan

Applied this change, on top of D25142. Afterwards, cowsay and figlet still render, and entering a {{{a}}} a does not show a "No interpreter found" error message anymore. Instead it just keeps the text as-is.

If you care about anything, you can still discover the related issue in the error log.

The original message is kept as-is, preserving your newlines.

  • Try creating a Comment with a cowsay and a figlet
  • Try again enabling self-action email (so you receive an email with cowsay and figlet)
  • Try with HTML Email format and also without that (pure ASCII)

Diff Detail

Repository
rP Phorge
Branch
arcpatch-D25234_2
Lint
Lint Passed
Unit
Tests Passed
Build Status
Buildable 459
Build 459: arc lint + arc unit

Event Timeline

Apparently we need some time on this little puppy class

src/infrastructure/markup/blockrule/PhutilRemarkupInterpreterBlockRule.php
43–45

Thanks to your patch I discovered this $interpreters variable here that is apparently designed to support a very extensible system to support whatever weird custom interpreter. So probably we cannot just hammer cowsay and figlet in the regular expression, otherwise we block extensions, and we should preserve the good things from the person who over-engineered of this parser.

This revision now requires changes to proceed.May 20 2023, 18:27

Premising I very like the strategy in this patch that is "don't parse unuseful stuff" in order to "don't suppress weird errors", so, causing to preserve the user input.

I'm trying to work on a way to keep the possibility to implement whatever generic "interpreter" but without altering the user input just because we have not found any weird "interpreter". For these cases, logging could be really enough.

I discussed with aklapper and, hoping to be useful to save the time of this precious contributor, I will update with the patch that we have seen together in live during Wikimedia Hackaton (so you don't need to amend and etc.)

Allowing extensions to override things, but, stop replacing user input just to show weird errors (and log instead)

valerio.bozzolan edited the test plan for this revision. (Show Details)

Premising that we invested so much efforts in over-engineering this stuff, and premising that this is objectively good:

Phorge Cowsay Figlet showcase.png (891×837 px, 66 KB)

I mark this as approved by O1 since it fixes a crash, it fixes a workflow, it improves the documentation.

Thanks for any other contribution to further improve the situation.

This revision is now accepted and ready to land.May 20 2023, 20:16

Uhm. We are breaking something here. I'm testing with:

figlet {{{
hi
}}}

cowsay {{{
hi
}}}

cowsay (eyes=pp) {{{
asd
}}}

And testing self-email, I get:

Cowsay figlet in Phorge before D25234.png (397×313 px, 8 KB)
Cowsay figlet in Phorge after D25234.png (651×350 px, 13 KB)

And after the patch, the texts in ASCII mode are somehow duplicated

This revision now requires changes to proceed.May 25 2023, 14:36
valerio.bozzolan added a subscriber: avivey.

I absolutely don't know what is happening here, but now it works again after a rebase.

+1

Maybe also @avivey want to give a look

This revision is now accepted and ready to land.May 25 2023, 14:54
avivey requested changes to this revision.May 26 2023, 06:34

I don't see how this stops the exception discussed in T15372 - nothing in this change looks like it might stop an "Undefined array key" exception?

src/infrastructure/markup/blockrule/PhutilRemarkupInterpreterBlockRule.php
84–88

I don't think this log is useful - it's a user error, not a program error, and the user will not see it.

This revision now requires changes to proceed.May 26 2023, 06:34
In D25234#7296, @avivey wrote:

I don't see how this stops the exception discussed in T15372 - nothing in this change looks like it might stop an "Undefined array key" exception?

I need some time to also clarify that specific question, but I can reproduce the current issue with this Comment:

Ciccio {{{Foo}}} indeed

Try, and you will not be able to save a new Comment. With the proposed patch, it does not happen anymore.

Ah yeah indeed now I understand the issue:

The parser was matching some line(s), but eventually returning an error message with potentially a different number of line (1). So, now it seems this also matches the expected lines since I cannot reproduce that problem anymore.

Where was it "returning an error message with potentially a different number of line" ?

In D25234#7303, @avivey wrote:

Where was it "returning an error message with potentially a different number of line" ?

As far as I've understood, line 81 was returning the error $message that has a fixed length ("No interpreter found", 1 line) but that was not handled correctly since it's length was not matching the one returned from getMatchingLineCount(), that does not take into consideration potential errors.

So, with this patch we do not need anymore to fix the getMatchingLineCount() method, since now it is more consistent with the returned lines. This should be now true with all cases, not just non-erroneus cases.

That sounds very sketchy - getMatchingLineCount should either refer to raw text, or to post-processed text - and the 2nd option isn't possible (There might not be any text in the result, like in macro).

So I would expect the output of getMatchingLineCount not to depend on the output of markupText at all - definitely not force the implementation to have a specific "line count".

Just to clarify: I think we both agree on these four things:

  1. showing a "No interpreter error" to the frontend could be weird, at minimum this is true for MediaWiki users (not just Wikimedia users) and showing the original raw text could be more useful (green light)
  2. after this ↑ it has sense to just log the problem (green light)
  3. this Diff works and also "somehow" fixes an error (green light)
  4. it could be really interesting to discover where was the root problem (→ let's dig)

With these premises, I totally agree that the n. 4 is scarying. It seems the business logic reading getMatchingLineCount() is inside PhutilRemarkupEngine::splitTextIntoBlocks().

I would really thank you for whatever additional note you eventually discover. I will do the same.

Just to clarify: I think we both agree on these four things:

  1. showing a "No interpreter error" to the frontend could be weird, at minimum this is true for MediaWiki users (not just Wikimedia users) and showing the original raw text could be more useful (green light)

I'm not sure I agree with this. This case is strange enough that I think anyone who got here, got here by trying to use this feature, not by mistake. In that case, showing an error is clearer then just passing the input as is (and not interpreting whatever remarkup it has in the body!)

  1. after this ↑ it has sense to just log the problem (green light)

I don't agree with this - as I said, this is a user error, and logs are generally only shown to administrators.

  1. this Diff works and also "somehow" fixes an error (green light)

I'm not sure it "fixes" the problem or just "hides" it - that depends on the definition of the problem.

  1. it could be really interesting to discover where was the root problem (→ let's dig)
src/infrastructure/markup/blockrule/PhutilRemarkupInterpreterBlockRule.php
84–88

I see your point. Premising that phlog() is useful to server operators, but it seems is useful also to DarkConsole users to understand what is going on.

Last word to you. I can surely remove that if you still dislike. I have not any strong opinion here 👍

src/infrastructure/markup/blockrule/PhutilRemarkupInterpreterBlockRule.php
84–88

I'd rather see this class unchanged, and the root cause of the exception handled.

The underlying issue is way more contained now that rP7868ab3754fa is merged. I'm abandoning this revision to clean up my backlog.