GNOME Bugzilla – Bug 565082
Migration of Camel's summary SQLite data schema, adding two columns
Last modified: 2009-01-12 09:47:21 UTC
This patch adds the capability of allowing the developer to add migration code for in case the schema of Camel's summary SQLite data format changes. It also adds two columns, modified and created, to the summary data being stored. Will attach
Created attachment 124989 [details] [review] Initial patch that implements this
I'm marking this bug as high because the sooner we have this migration code in CamelDB and to sooner we have the two extra columns, to quicker we can start working on proposals like http://live.gnome.org/Evolution/Metadata. You can of course put it back to normal priority. But let's not let too many Evolution versions pass by before we have the SQLite schema the way we'll need it.
Added Bug #565091 which is blocked by this bug. Bug #565091 is a bug for tracking progression on the Evolution metadata support in Tracker.
Philip, your patch, design seems nice to me. I haven't tested, but it should work. Lets test it out, but before that some suggestions. (*) We read/write versions every time. Write is ok, since we anyways write in a transaction. Can we avoid that disk read for version and pass it across or something? (*) We gonna have some changes wrt in-memory tables during commit, to avoid the sqlite3/fsync journals, which is hurting us atm. Just make sure that this fits well with that. Your temp_%Q can be better, if it is a in-memory temporary table, you would save lots of disk write for the temporary.
"CREATE TEMP TABLE IF NOT EXISTS 'temp_%q'" , it is a temporary in-memory table. That's what the "TEMP" that after "CREATE" comes does.
That missed my eyes. Sorry for the spam.
Created attachment 125192 [details] [review] This patch also adds logging deletes But it might not be very finished. I mean that deletes are forever logged this way, never deleted. The idea is of course to trim the list to a certain maximum number of logged deletions. If the Evolution team has any suggestions in this area, please let me know.
Philip, Can you make the APIs to one transaction? Other wise these are 3 sqlite3 calls to disk. Also there is another API where we do bulk deletes, you may have to do the same there also.
(In reply to comment #5) > "CREATE TEMP TABLE IF NOT EXISTS 'temp_%q'" , it is a temporary in-memory > table. That's what the "TEMP" that after "CREATE" comes does. > AFAIK, TEMP TABLE does not guarantee to be a in-memory table. It is just that the contents of this table are not stored inside the opened database file and in a temporary location and the tables are removed once the connection is closed. So INSERT INTO temp-table will still cause disk-writes.
.... and if one needs a in-memory db we need to ATTACH DATABASE :memory: AS blahblah etc. iiuc. (Read this as the continuation of the previous comment)
Created attachment 125329 [details] [review] Same patch, taken into account the comments on last version Please re-review
Created attachment 125331 [details] [review] Same patch, taken into account the comments on last version, fixed a glitch
Ignore your change in camel-db.h, looks fine otherwise to me. Sankar, is this fine for you too, on the in-memory part?
The in-memory part looks fine to me. srag, didnt you want to rename one column into something else (iirc) ? If so it may be a good time to push that change also , so we touch the db structure only once. Philip, you wanted a way to see if a message is fetched or not, so, a is_cached column will be good for you (instead of flags etc.) ?
An `is_cached` sounds great for that purpose. We can easily adapt this patch to already add that column, and later provide an implementation that updates the table and the row's `is_cached` cell to also be accurate. Note that just checking for the filename, once I have camel_folder_get_filename (see Bug #566279) using fstat is sufficient for my purposes. So `is_cached`, if I have camel_folder_get_filename, is not strictly necessary for me at this point. Let me know which columns I should add.
(In reply to comment #15) > An `is_cached` sounds great for that purpose. We can easily adapt this patch to > already add that column, and later provide an implementation that updates the > table and the row's `is_cached` cell to also be accurate. > > Note that just checking for the filename, once I have camel_folder_get_filename > (see Bug #566279) using fstat is sufficient for my purposes. So `is_cached`, if > I have camel_folder_get_filename, is not strictly necessary for me at this > point. > I have a folder called go-oo, where I am subscribed to the openoffice users mailing list. I get about 200 mails everyday in that. I check 2-3 mails on these folders everyday. Those mails that I have opened I will want them to be tracker-able. So, if you start doing a get_filename for all thousands of infoes it will be too costly a call to index the few tens of mails that I read. If your query for (WHERE IS_CACHED=1) you save a lot of CPU cycles. So, I will strongly recommend you to add this column also. The providers should take care of setting this field on get_message, transfer_messages_to, etc. which can be taken care after the initial patch is made and is not blocking for this patch. Patch is good for commit in my opinion. srag, you dont want to rename any column (msg_security or something ?)
Does this mean that this patch is ready for commit? Or do I wait for srag's review?
msg_security as dirty. Which I would use to sync to the server.
(In reply to comment #17) > Does this mean that this patch is ready for commit? Or do I wait for srag's > review? > I believe the patch is good to commit. but since we are migrating anyway, it is better to accomodate the rename column of msg_security to dirty along with this. so, we have one less coding kludge. Patch looks nice and yes, can be committed.
Created attachment 126257 [details] [review] Final version
Committed as 9920 in eds's trunk