GNOME Bugzilla – Bug 69813
Parentheses should not be optimised out
Last modified: 2008-08-06 10:36:02 UTC
Both the Excel importer and the manual entry of formulae remove "redundant" parentheses from expressions. This is not the Right Thing because as well as being a mathematical expression in a computer, the formula is visible to the user and may later be edited again. For a parser it is obvious that (a*b)+(c*d) == a*b+c*d but for a user it is NOT obvious that this is the case, and they will be distressed when their readable expression is transformed into an (equivalent) hard to read string of symbols. What I did: Typed =(5*3)+(6*7) and pressed RETURN or ENTER :) What I expected: Gnumeric shows =(5*3)+(6*7) in edit widget, 57 in cell What happened: Gnumeric shows =5*3+6*7 in edit widget, 57 in cell The same thing happens to imported Excel formulae, the expression parser explicitly ignores Excel's PTG for these "unnecessary" parentheses.
This is a common request. When we revisit the expression representation we'll see what can be done to store the parens.
While I'm poking bugs into GNOME bugzilla, any progress on this? Is there a specific blocker, or is just low priority?
Nothing blocks it. It's all a matter of priorities and this one hasn't caught the interest of anyone.
Created attachment 106882 [details] [review] Support parentheses used for clarity
The above patch compiles and survives some brief testing. To try it out: Just add some parentheses to an expression that aren't strictly needed. They should now be retained internally, and survive being round-tripped through Gnumeric's own file format, Excel, and of course any formats in which we just emit an arbitrary expression string. Please commit this to Gnumeric trunk. Since it touches the core code this would benefit from more thorough testing but if there are any outstanding problems they should be easy to identify and fix.
Looks very promising. Could you please look into why the lookfuns.xls test fails (t1009-lookfuns.pl) before I commit?
Comment on attachment 106882 [details] [review] Support parentheses used for clarity Looks good with some minor tweaks. The change of flags within eval looks questionable. Do you have a test case to back that up ? I would have expected return gnm_expr_eval .... in ms-formula-write please use FORMULA_PTG_PAREN, and later when testing use xl_op != PTG_PAREN rather than using != 0. I'm a bit unclear on how to differentiate explicit parens, and sets but we can explore that when sets get cleaned up.
Created attachment 106928 [details] [review] Updated patch to fix above concerns Updated patch. Addresses Jody's concerns above, hopefully. Also works around Morten's reported Excel import trouble by suppressing parentheses when Gnumeric has already created a "set". The real long term fix here is to re-consider the situation with Gnumeric's sets vs Excel's union operator. I suspect that Excel's AREAS() function is more sophisticated than our tests allow for as well. But, we pass all the same tests with this patch as without. [ NB lookfuns.xls can't be roundtripped by Gnumeric currently with or without my patch, again due to sets vs unions I think ]
In case it's helpful, and before I forget... the following is based on reading people's Excel tips, a copy of their user guide from some time when Windows had 8.3 filenames and a bit of head-scratching. It might help Jody. Excel only has explicit parentheses (of course it synthesises the other kind for display as Gnumeric does). However its unions don't require parentheses in normal use (whereas Gnumeric's sets seem to include parentheses in their syntax). So explicit parentheses are used when calling a function like AREAS which takes a single reference parameter, as function parameter commas have higher precedence than Union operator commas. So =AREAS((A1:A2)) in Excel has a function call plus explicit parentheses surrounding a union, whereas in Gnumeric it's just a function call and a set. My instinct is that union is more general and worth adopting, at least in the expression syntax. But no doubt other people have spent more time thinking about this than I have.
Oops, that example left a lot to be desired, try this one.. So =AREAS((A1,B2)) in Excel has a function call plus explicit parentheses surrounding a union, whereas in Gnumeric it's just a function call and a set.
The lookfuns test is a bit of a blocker issue for this. It's not important, per se, but a failing test means I cannot create a release, :-/
Doesn't lookfuns test pass for you with the latest patch ? ------------------------------------------------------------------------------- t1009-lookfuns.pl: Check that lookfuns.xls evaluates correctly. | Using exporter Gnumeric_stf:stf_csv | Reading file:///usr/local/src/gnumeric/samples/excel/lookfuns.xls | Excel 97 + | Writing file:///usr/local/src/gnumeric/test/lookfuns.csv Pass As you can see the test passes for me. What happens when you run it?
welinder@toshiba:~/gnome/gnumeric/test> ./t1009-lookfuns.pl ------------------------------------------------------------------------------- t1009-lookfuns.pl: Check that lookfuns.xls evaluates correctly. | Using exporter Gnumeric_stf:stf_csv | Reading file:///home/welinder/gnome/gnumeric/samples/excel/lookfuns.xls | Excel 97 + | Writing file:///home/welinder/gnome/gnumeric/test/lookfuns.csv | #VALUE! Fail. Specifically, D29 and F29 become #VALUE!. This is with the first patch. I'll try the second later.
I do not see the problem with the second patch.
Great, then I think you can commit this and close the bug.
Something is still wrong -- we're looking at it. The change in ms-formula-read.c that drops a parenthesis around a set is wrong. However, reverting that to a regular unary operator brings back the area problem. (The extra parenthesis should have no effect on evaluation, but it does.)
Created attachment 107265 [details] [review] Updated patch This normalizes the read and fixes the AREAS problem. It's not pretty.
Nick: why do we need special handing in ms-excel-write for this?
Hmm, it's been a few months. I will have to page all this back in. Please forgive if my first attempt isn't everything it should be. Gnumeric has a set of operators, my patch adds one new unary operator which makes no functional difference to the calculation engine, but does have precedence, ensuring that parentheses entered by the user are retained even when mathematically redundant. However, adding an operator has some knock-on effects. There are some arrays which use the enumerated operator type as their index. Thus my patch needs to add an element to those arrays for the new operator, and if necessary fix any related logic. In the Excel exporter, the existing code used such an array and assumed that parentheses should be emitted for unary operators whose precedence (in Excel) was otherwise insufficient for the correct formula. I added the array element and ensured that its precedence was chosen so that it would always meet this requirement. But ordinarily the code also emits the unary operator itself, which we don't want to the do for the parentheses. Your latest patch seems to achieve the same thing in this particular region of the code, but in a slightly more readable way. -- If the confusion caused by the difference between Excel's more powerful region union operator and Gnumeric's relatively weaker "areas" concept remains a problem so many months after a patch was available for the core bug, let me recommend a multi-step solution 1. Apply changes only to the core Gnumeric code, avoid any attempt to improve Excel import/ export for now, just do enough to ensure the exporter and importer don't crash on explicit parentheses. This change is very useful on its own, and ought to be able to happen very quickly whether in a stable series or a development series. 2. Add the union operator to Gnumeric. This may be a large piece of work. I am reluctant to volunteer for it, but if there's no-one else and I have agreement in advance that such an improvement would be accepted, I'd knuckle down and try to do it over the next year. 3. Use the new union operator to import and export Excel unions. 4. Fix the Excel importer and exporters to work with explicit parentheses now that the confusion from union operators is eliminated in the parser.
Patch is in, minus the problematic ms-excel-write part. If we have problems with exporting certain expressions because we synthesize parentheses in some circumstances, then we need to fix that and not simply drop all parentheses.
*** Bug 546528 has been marked as a duplicate of this bug. ***