GNOME Bugzilla – Bug 467911
[subparse] sami parser update
Last modified: 2008-05-05 11:13:54 UTC
Please describe the problem: There are two changes. 1. Process only the content data that is surrounded by the 'sync' tag. 2. Compare tag names up to the terminating NUL. After I updated to gutsy, the SAMI parser couldn't skip the SAMI header data and just rendered it. I found that HTMLParser in libxml2 has changed and doesn't emit the "title" tag any more, so I updated the SAMI parser to process only the content data that is surrounded by the "sync" tag. Also, the existing xmlStrncmp calls didn't check for the end of the string (the terminating NUL), so a tag name that merely starts with the expected name can match, which can cause bugs.
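To illustrate the second change, here is a minimal sketch. It uses the standard C strncmp and strcmp as stand-ins for libxml2's xmlStrncmp and xmlStrcmp (an assumption, made to keep the example dependency-free), and the helper names are hypothetical, not taken from the patch. It shows how a length-limited comparison accepts any name that merely begins with the expected tag, while a comparison that runs to the terminating NUL does not.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: length-limited comparison. It stops after
 * strlen(expected) bytes, so it never checks whether `name` continues
 * past that point. */
bool
tag_matches_prefix (const char *name, const char *expected)
{
  return strncmp (name, expected, strlen (expected)) == 0;
}

/* Hypothetical helper: full comparison. It compares up to and including
 * the terminating NUL, so only an identical name matches. */
bool
tag_matches_exact (const char *name, const char *expected)
{
  return strcmp (name, expected) == 0;
}
```

With these helpers, tag_matches_prefix ("synchronize", "sync") is true while tag_matches_exact ("synchronize", "sync") is false; both are true for the name "sync" itself.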
Created attachment 93890 [details] [review] sami parser update
How odd. This is not a bug in libxml2 then? Could you add such a SAMI file so I can write a unit test?
Ping?
Created attachment 97111 [details] test source for checking libxml behavior
Sorry for the delayed reply. I couldn't check my mailbox for a while. I wrote a test program to compare libxml2 behavior between 2.6.27 (feisty) and 2.6.30 (gutsy), and found that they behave differently. First of all, both emit the 'title' tag; I had misunderstood that. However, 2.6.30 emits the CSS contents inside the "style" tag (this is what I meant in the report when I said the parser could not skip the SAMI header data), while 2.6.27 does not. So the patch I attached resolves this bug. I'll attach the test source and the results.
Created attachment 97112 [details] result of sample code against libxml2 2.6.27
Created attachment 97113 [details] result of sample code against libxml2 2.6.30
An example file would be at http://svn.annodex.net/scripts/trunk/subtitles/sample-sami.smi

Working on that now...
Something similar to your patch is now committed... thanks :)

2008-05-05  Sebastian Dröge  <slomo@circular-chaos.org>

	Patch by: Young-Ho Cha <ganadist at chollian dot net>

	* gst/subparse/samiparse.c: (handle_start_sync),
	(start_sami_element), (end_sami_element), (characters_sami),
	(sami_context_reset):
	Only output characters inside the "sync" elements. There could
	be other elements like "style" that have some content but should
	not be printed. Fixes bug #467911.
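For readers following along, the idea of only outputting characters inside "sync" elements can be sketched roughly as below. This is not the committed code: the SketchState structure and the sketch_* callbacks are hypothetical names that only mimic the shape of SAX-style start/end/characters handlers, and the sketch assumes element names arrive in lower case and that the end of a "sync" element is reported to the handler.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical parser state; not the actual GStreamer structure. */
typedef struct
{
  bool in_sync;                 /* true while inside a "sync" element */
  char out[256];                /* accumulated subtitle text */
} SketchState;

void
sketch_reset (SketchState * s)
{
  s->in_sync = false;
  s->out[0] = '\0';
}

void
sketch_start_element (SketchState * s, const char *name)
{
  if (strcmp (name, "sync") == 0)
    s->in_sync = true;
}

void
sketch_end_element (SketchState * s, const char *name)
{
  if (strcmp (name, "sync") == 0)
    s->in_sync = false;
}

/* Character data is kept only while inside "sync"; anything else,
 * such as CSS inside "style", is dropped. */
void
sketch_characters (SketchState * s, const char *ch, size_t len)
{
  size_t used, room;

  if (!s->in_sync)
    return;

  used = strlen (s->out);
  room = sizeof (s->out) - 1 - used;
  if (len > room)
    len = room;                 /* truncate rather than overflow */
  memcpy (s->out + used, ch, len);
  s->out[used + len] = '\0';
}
```

Feeding this sketch a "style" element containing CSS followed by a "sync" element containing "Hello" leaves only "Hello" in the output buffer.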