Invalid XML being generated?
Posted: Fri Nov 06, 2009 11:13 pm
by dooferlad
Hi,
I am getting this in my XMLTV data file that a parser doesn't like:
Code:
<programme start="20091110170000 +0000" stop="20091110180000 +0000" channel="UK_Digi_88">
<title>Sen in Action</title>
<sub-title>10 Nov 09</sub-title>
<desc>Primary - Tackling Challenging Behaviour 1; &2; Research and Development in SEN - Movement; Secondary - Working with Pupils with Down's Syndrome: an hour featuring a range of SEN strategies.</desc>
<category>Education</category>
</programme>
I think the problem is the &2;, which I am guessing isn't a valid escape sequence (in XML a literal & has to be written as &amp;). That sequence of characters is in the original program description, so the grabber just needs to escape it.
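For illustration, here is a minimal Python sketch of the kind of escaping meant here. It is not the grabber's actual code; the function name `escape_bare_ampersands` is hypothetical. It assumes the only references that should pass through untouched are XML's five predefined entities and numeric character references, so anything else starting with & (including &2;) gets its ampersand escaped.

```python
import re

# An ampersand that does NOT begin a well-formed XML reference, i.e. not one of
# the five predefined entities and not a decimal (&#65;) or hex (&#x41;)
# character reference.
_BARE_AMP = re.compile(r"&(?!(?:amp|lt|gt|quot|apos);|#[0-9]+;|#x[0-9A-Fa-f]+;)")


def escape_bare_ampersands(text: str) -> str:
    """Replace stray '&' characters with '&amp;' and leave valid references alone.

    Hypothetical helper for illustration only. It does not check that a numeric
    reference points at a character XML actually allows.
    """
    return _BARE_AMP.sub("&amp;", text)


if __name__ == "__main__":
    desc = "Primary - Tackling Challenging Behaviour 1; &2; Research and Development"
    print(escape_bare_ampersands(desc))
```

Escaping rather than deleting keeps the description text exactly as the listings source supplied it.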
Re: Invalid XML being generated?
Posted: Wed Nov 18, 2009 10:51 am
by routerunner
Hi,
I'm having a very similar issue: the XMLTV importer I'm using doesn't like the " " escape sequence that appeared a few days ago in the DigiGuide data. This escape sequence is not valid in an XML file, and I think the grabber should just skip it.
For now I manually remove the unwanted escape sequence from the EPG every day.
Regards
Re: Invalid XML being generated?
Posted: Wed Nov 18, 2009 5:17 pm
by routerunner
Hi,
I found a solution (actually two) to our problem:
1) In the grabber tab, enable the XMLTV Importer along with UK_digi, with UK_digi at the higher priority. Set the XMLTV importer file to point to the same file as the output. After UK_digi has done its job, the XMLTV importer re-imports your output file, cleaning up every "unsupported" markup automagically.
2) I did a bit of Lua study today and managed to create a new postprocessor script that does the above after UK_digi runs. It works better because the first solution is not bulletproof: I had instances where the XMLTV importer didn't import the whole block of data, so you tend to lose the last programs in the file. I tried to upload the script as an attachment here, but the system doesn't allow .lua files.
Regards
Edo
Re: Invalid XML being generated?
Posted: Wed Nov 18, 2009 5:50 pm
by dooferlad
I don't know Lua and didn't have much time, so I wrote a Python script that throws away any invalid XML escape sequences. Now I call the grabber through a batch script that runs the XML cleaner script after the XMLTV file is generated. A bit messy, but it seems to be effective so far. Clearly a Lua version would be best! Since it wasn't a real fix to the problem I didn't want to post it here, because having to run a Python script is a bit of a nasty hack, but since it is up on the SageTV forums...
http://forums.sagetv.com/forums/showthr ... tcount=823
I really should have given it a license. If anyone wants to hack on it I can put it under Creative Commons non commercial share alike.
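For anyone who just wants the general idea, a cleaner along these lines might look roughly like the sketch below. This is not the script linked above; the names `drop_invalid_references` and `clean_xmltv_file` are made up for illustration, and it assumes the file is UTF-8 and that "invalid" means anything other than the five predefined XML entities or a numeric character reference. Throwing sequences away is lossy (a bare & in "AT&T" would simply vanish), so treat it as a stopgap.

```python
import re
import xml.etree.ElementTree as ET
from pathlib import Path

# '&' that does not start a valid XML reference, plus the rest of the bogus
# "name;" if it looks like a reference (e.g. "&2;" or "&nbsp;").
_INVALID_REF = re.compile(
    r"&(?!(?:amp|lt|gt|quot|apos);|#[0-9]+;|#x[0-9A-Fa-f]+;)(?:[A-Za-z0-9#]+;)?"
)


def drop_invalid_references(text: str) -> str:
    """Delete invalid escape sequences; valid references are left untouched."""
    return _INVALID_REF.sub("", text)


def clean_xmltv_file(src: str, dst: str) -> None:
    """Hypothetical wrapper: clean src into dst, then check that dst parses.

    ET.parse raises xml.etree.ElementTree.ParseError if the cleaned file is
    still not well-formed, so a failure is visible instead of silent.
    """
    cleaned = drop_invalid_references(Path(src).read_text(encoding="utf-8"))
    Path(dst).write_text(cleaned, encoding="utf-8")
    ET.parse(dst)
```

A batch script would then run the grabber first and call `clean_xmltv_file` on its output file afterwards.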
Re: Invalid XML being generated?
Posted: Wed Nov 18, 2009 7:50 pm
by routerunner
Hi,
Have a go at solution #1; it avoids the need for the Python script.
Edo