<programme start="20091110170000 +0000" stop="20091110180000 +0000" channel="UK_Digi_88">
<title>Sen in Action</title>
<sub-title>10 Nov 09</sub-title>
<desc>Primary - Tackling Challenging Behaviour 1; &2; Research and Development in SEN - Movement; Secondary - Working with Pupils with Down's Syndrome: an hour featuring a range of SEN strategies.</desc>
<category>Education</category>
</programme>
I think the problem is the &2;, which I am guessing isn't a valid escape sequence. That sequence of characters is in the original program description, so the grabber just needs to escape it.
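For what it's worth, here is a minimal sketch of what "just escape it" could look like, written in Python rather than in the grabber's own code. The function name escape_stray_ampersands is made up for illustration, and it assumes the only valid references are the five predefined XML entities plus numeric character references. It is not how UK_digi actually works.

```python
import re

# Matches an '&' that does NOT begin a valid XML reference, i.e. it is not
# followed by one of the five predefined entity names or by a decimal/hex
# character reference. (Assumption: the file declares no custom entities.)
_STRAY_AMP = re.compile(
    r"&(?!(?:amp|lt|gt|quot|apos);|#[0-9]+;|#x[0-9A-Fa-f]+;)"
)


def escape_stray_ampersands(text):
    """Hypothetical helper: turn every stray '&' into '&amp;'.

    Valid references such as '&amp;' or '&#38;' are left untouched.
    """
    return _STRAY_AMP.sub("&amp;", text)
```

With this, the "&2;" in the description above would be written out as "&amp;2;", so a parser reads it back as the literal text "&2;" and nothing is lost from the description.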
I'm having a very similar issue with the XMLTV importer I'm using, which doesn't like the " " escape sequence that appeared a few days ago in the DigiGuide data. This escape sequence is not valid in an XML file and I think the grabber should just skip it.
I now manually remove the unwanted escape sequence from the EPG every day.
1) In the grabber tab, enable the XMLTV Importer along with UK_digi, with UK_digi at the higher priority. Set the XMLTV importer file to point to the same file as the output. After UK_digi has done its job, the XMLTV importer re-imports your output file, cleaning up every "unsupported" markup automagically.
2) I did a bit of Lua study today and managed to create a new postprocessor script that does the above after UK_digi runs. It works better because the first solution is not bulletproof: I had instances where the XMLTV importer didn't import the whole block of data, so you tend to lose the last programs in the file. I tried to attach the script here, but the system doesn't allow .lua attachments.
I don't know Lua and didn't have much time, so I wrote a Python script that throws away any invalid XML escape sequences. Now I call the grabber through a batch script that runs the XML cleaner script after the XMLTV file is generated. A bit messy, but it seems to be effective so far. Clearly a Lua version would be best! Since it isn't a real fix to the problem, I didn't want to post it here, because having to run a Python script is a bit of a nasty hack, but since it is up on the SageTV forums...
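For anyone who wants to try the same approach, a cleaner along these lines might look roughly like the sketch below. This is not the script from the SageTV forums. The names drop_invalid_references and clean_xmltv_file are invented here, and it assumes the XMLTV file is UTF-8 and small enough to read into memory in one go.

```python
import re
import xml.etree.ElementTree as ET

# A reference the XML spec accepts without a DTD: a predefined entity or a
# numeric character reference.
VALID_REF = re.compile(r"&(?:amp|lt|gt|quot|apos|#[0-9]+|#x[0-9A-Fa-f]+);")

# Anything that merely looks like a reference: '&', some non-space
# characters, then ';'.
ENTITY_LIKE = re.compile(r"&[^\s&;<>]*;")


def drop_invalid_references(text):
    """Hypothetical helper: delete entity-like sequences that are not valid."""
    def keep_or_drop(match):
        token = match.group(0)
        return token if VALID_REF.fullmatch(token) else ""
    return ENTITY_LIKE.sub(keep_or_drop, text)


def clean_xmltv_file(path):
    """Hypothetical helper: clean the file at 'path' in place.

    Assumes UTF-8. Parses the cleaned text first so that a file that is
    still not well-formed raises ParseError instead of being overwritten.
    """
    with open(path, encoding="utf-8") as handle:
        cleaned = drop_invalid_references(handle.read())
    ET.fromstring(cleaned)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(cleaned)
```

The batch script would then run the grabber and call clean_xmltv_file on its output file. Note that this throws the offending characters away, so "1; &2; Research" becomes "1;  Research", and a lone "&" followed by a space is not caught at all, in which case the parse check fails and the file is left as it was.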