Bug 355659 – CalDAV: New appointments disappear for 1 minute, and then reappear

After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.

Bug 355659 - CalDAV: New appointments disappear for 1 minute, and then reappear


Summary:	CalDAV: New appointments disappear for 1 minute, and then reappear


Status:	RESOLVED FIXED

Product:	evolution
Classification:	Applications
Component:	Calendar
Version:	2.6.x (obsolete)
Hardware:	Other All

Importance:	Normal minor
Target Milestone:	---
Assigned To:	Harish Krishnaswamy
QA Contact:	Evolution QA team

URL:
Whiteboard:

Depends on:
Blocks:

Reported:	2006-09-12 21:19 UTC by Andrew McMillan
Modified:	2007-08-23 17:33 UTC

See Also:
GNOME target:	---
GNOME version:	2.13/2.14

Attachments
Fix this by falling back to an index on ETag, which must match because it was server-supplied. (4.48 KB, patch) 2006-11-14 09:34 UTC, Andrew McMillan	none	Details \| Review
Fall back to an index on ETag, which must match because it was server-supplied. (4.48 KB, patch) 2007-05-16 00:34 UTC, Andrew McMillan	none	Details \| Review

Description Andrew McMillan 2006-09-12 21:19:28 UTC

Please describe the problem:
When I add a new appointment to a CalDAV calendar it is initially displayed.  Shortly afterwards the appointment disappears for 1 minute, and then reappears and remains visible thereafter.

Steps to reproduce:
1. Add appointment to CalDAV calendar
2. Watch appointment disappear
3. Watch appointment reappear


Actual results:
When I add the appointment, Evolution does a "PUT" to send the new appointment to the server.  Shortly thereafter (on a regular schedule) Evolution makes a "REPORT" request to the server, sees the new appointment is there and consequently does a "GET" request to retrieve it.

At (roughly) this point the appointment disappears from the displayed calendar in Evolution.

One minute later (on schedule) Evolution again does a REPORT request, followed by a GET request, and the appointment now re-appears on the calendar.

Each minute thereafter, Evolution does a REPORT request, but does not do a GET request - it appears to know for sure about that appointment now.

Expected results:
I would expect Evolution to do a PUT/REPORT/GET sequence of events immediately the appointment is stored, and to know about it thereafter.

I would not expect to see two sequences of REPORT/GET for teh appointment, and I would especially not expect to see it disappear from view.

Does this happen every time?
Yes.

Other information:

Comment 1 Andrew McMillan 2006-10-10 02:47:31 UTC

OK. I have tracked down what is happening here, and it is not trivial...

It seems that Evolution does a PUT and invents an HREF like:

http://user@server/blah/this/that.ics

An ETag is also returned for that calendar resource.

In due course, Evolution makes a REPORT request, which returns the HREF as:

http://server/blah/this/that.ics
An ETag is also returned for that calendar resource.

With Cosmo those two ETags are not identical (that may be a Cosmo bug, or perhaps Cosmo modifies the content in some way), so the behaviour is not seen when you use Evolution with Cosmo.

With my own server an identical ETag will be returned repeatedly unless the server-side item has changed.

So, in Evolution (calendar/backends/caldav/e-cal-backend-caldav.c) the synchronize_cache function takes the following approach:

Construct a hash table of the cache keyed on the cached HREF.
For each <response> in the REPORT:
If the <response><etag> matches the cached ETag
Remove the entry from the hash table
Else
Synchronise the entry from the server.

For each entry in the hash table:
Get the UID of the entry.
Remove any item with that UID from the cache.

So, because the hash is keyed on HREF, and the HREF in the REPORT response from the server differs from the HREF which Evolution has stored in it's cache, there is no match, and the entry is not removed from the hash table.

So the remove loop now runs, finding the newly synchronised entry we just received from the server. This one is _NOT_ looking to remove from the cache based on the (changed) HREF, however, it is using the UID which must be invariant.

So the newly fetched record is now removed from the cache (and from the screen) until the next synchronize_cache run, at which point since it isn't in the cache at all, it doesn't get removed after the fetch from the server.

Subsequent runs are fine, because then the cached HREF and the server's idea of the HREF are in agreement.

Several possible solutions present themselves to me:

(1) In my server I could rewrite the object on the way in, so that the HREF within the iCalendar event gets the "user@" stripped out of it. Given that this property is tagged "X-EVOLUTION-CALDAV-HREF" doing so seems to be an extremely dodgy decision.

(2) In my server I could output URLs in the REPORT response as "http://user@domain/blah/...". This seems a fragile fix, which might work on this occasion but could be incompatible with some other software. It also potentially leaks valid user names in proxies and logs which might not normally be leaked in (e.g.) an SSL connection.

(3) synchronize_cache could be modified so that the hash table is keyed on UID, rather than on HREF. While this seems superficially to be the best solution, with the least amount of impact on the existing code. The UID is unfortunately an optional value in the iCalendar specification, so is not always guaranteed to be present. It's OK to use it in the above code because at that point in the code we know that we are just dealing with Evolution, and in particular with the cache, where UID will always be set.

(4) Evolution could be changed to do the synchronise_object call _after_ all mismatched items have been removed from the cache. This seems to be a reasonable solution, but would involve some additional coding to construct the list of items to later be synchronised.

(5) Evolution could remove the entry from the hash list if it successfully synchronised that item, deferring any cache cleanup until the next time through.

This seems the best solution to me, but I would like to get a second opinion from someone else before coding it up and submitting a patch. It's also possible that my server is in some way not compliant with the specification on this point, but if so I can't find it!

Many thanks,

Andrew McMillan.

Comment 2 Andrew McMillan 2006-11-14 09:34:14 UTC

Created attachment 76552 [details] [review]
Fix this by falling back to an index on ETag, which must match because it was server-supplied.

OK.  I've managed to fix this bug, or at least reduce it's visibility down from 1 minute to a 'blink'.  It's a lot better, but not perhaps as elegant a fix as I would like.

My current best fix works as follows:

In synchronize_cache I have:
 - added an additional index based on ETag (as well as the one based on Href)
 - Fall back to the ETag index when we can't find the Href in the cache

This is based on the assumption that an ETag should be a reference which is sufficient to strongly identify a calendar resource, which it should be (according to the HTTP/1.1 or CalDAV specifications).  However it is allowable to return a Weak ETag (W/"...") and perhaps this approach should be avoided in that case.  It appears that Apple's Calendar Server _does_ return Weak ETags (all the other ones I have seen do not).

In reality though, even weak ETags should be sufficiently distinctive, and since this still forces a call to synchronize_object in this case I don't believe we can actually go wrong.

This still doesn't seem as clean to me as having a function to canonicalise URLs in a stuctured manner, and there is probably already one somewhere in the various Gnome libraries that does that, but I don't know where you'd find it.

If you aren't seeing this bug (there is a comment from someone on bug 354855 who doesn't) it is probably because your calendar server is running on a non-default port, or because the URLs supplied by the calendar server are massaged client side (e.g. by the suggested patch to bug 354855).  This problem will only occur when the client and server choose different approaches for representing URLs, such as by one unnecessarily including ":80" or ":443" for a port reference.

I still believe that both this and bug 354855 would most elegantly be fixed by passing URLs through something which standardised them, rather than constructing some in Evolution, and depending on the Calendar server to return URLs constructed in the same way.  If someone knows of an existing function that does that, please note it here (or on bug 354855 :-)

Thanks
Andrew McMillan.

Comment 3 Andrew McMillan 2006-11-26 01:35:42 UTC

In a private e-mail, Christian Kellner comments:

I totally see the point of that patch but I would prefer to use the UID
instead of the ETag because the etag might have changed and the UID is
much more reliable. I also don't think it will ever happen that a object
doesnt have a UID because I think thats the main unique object
identifier for most clients/servers. In case the UID isnt there we just
cry for a second and assume the object is new because after Evo has seen
the object it will have a UID anyway (maybe we should even write that
back instantly .. needs more thinking). This also brings me to the point
where I am not sure if we shouldn't just use the UID for the object
matching and forget about the href. Having redundancy here only impacts
performance. Opinions? (Either I or you should copy/paste that to the
bug)

Comment 4 Andrew McMillan 2006-11-26 02:01:08 UTC

I actually initially started to code this with UID, but switched to use ETag on my second attempt since I wasn't confident that UID was guaranteed to be available.

I'll code a version with UID and confirm that also works.

Thanks for the review,
Andrew McMillan.

Comment 5 Andrew McMillan 2006-11-26 02:44:52 UTC

OK.  The problem with using UID (I think this was where I got to before, and probably why you never coded it that way in the first place) is that the response to the REPORT from the server doesn't contain UID, so when we loop through the report response we only get a choice of looking for HRef or ETag.

The ETag approach is safe enough, however.  The only time we will actually find by ETag will be the first REPORT pass after we PUT a new event onto the server.  In all other situations (subsequent PUT events, events that were new on the server, PUT by someone else, and so forth) we will find it just fine by HRef.

Cheers,
Andrew McMillan.

Comment 6 Håvard Wigtil 2006-12-07 19:59:49 UTC

(In reply to comment #5)
> OK.  The problem with using UID (I think this was where I got to before, and
> probably why you never coded it that way in the first place) is that the
> response to the REPORT from the server doesn't contain UID, so when we loop
> through the report response we only get a choice of looking for HRef or ETag.

Andrew: Does this mean that the attached patch is the final patch, or does your patch still need more work?

I'm also seeing this with my implementation in Horde, so I'm hoping for a fix.

Comment 7 Andrew McMillan 2006-12-07 20:41:00 UTC

My patch works, but it really needs gicmo and/or kharish to give it a final review and agreement before it can be committed, I think.

Comment 8 Andrew McMillan 2007-03-20 00:45:44 UTC

I regularly get asked "Hey!  I just added an appointment to my CalDAV calendar but now it is gone!  What happened?"

In general there seems to be some deep cache desynchronisation issue in Evolution CalDAV somewhere (and I'm not sure this is entirely just the problem we see here) because I have seen situations where not all CalDAV entries are showing, and a "rm -rf ~/.evolution/cache/calendar/*; killev" will get everything showing up again.

Would it be possible to get this patch reviewed, and any suggestions for improvement, or approval added here?  In particular some update on gicmo's thinking re comments #3 and #5 would be helpful.

Thanks,
Andrew McMillan.

Comment 9 Ross Burton 2007-03-22 22:21:40 UTC

Karsten asked me to have a look at this.  The patch looks good to me, it does the right thing and doesn't leak.

Comment 10 André Klapper 2007-05-15 15:23:27 UTC

when committing this, the // comment style should be changed to /* */ to avoid compiling problems on other platforms (solaris)

Comment 11 Andrew McMillan 2007-05-16 00:34:41 UTC

Created attachment 88254 [details] [review]
Fall back to an index on ETag, which must match because it was server-supplied.

I have edited the patch to change the comment style to /* ... */

Comment 12 Andrew Ruthven 2007-05-31 20:48:56 UTC

If people are happy with Andrew M's updated patch, then I'm happy to commit it.

Comment 13 Christian Kellner 2007-06-01 12:40:16 UTC

See last comment on bug 354855

Comment 14 Christian Kellner 2007-08-23 17:33:13 UTC

Copied from bug #354855: 

The small story is: we only store the last part of the href (i.e. the filename)
in the  local cache (and the component) and reconstruct the full uri on every
server interaction. I will attach the patch here, and I would highly encourage
everybody to look over it and shout and cry on any issue you can find. Thanks.

[The patch is attached to #354855 here is the link:
http://bugzilla.gnome.org/attachment.cgi?id=94203&action=view]