Sourceforge to Roundup Converters
=================================

This README describes the sourceforge_ to roundup_ converters available
via svn from http://svn.python.org/projects/tracker/importer/.

The converters were written with the aim of providing the python
project with its own bugtracker based on roundup_, to replace the
sourceforge bugtracker that was used earlier. The code in the
converters is specific to roundup and to the roundup instance developed
for the python project, but it can probably serve as a base and
inspiration for other projects that need to process sourceforge data.

Three Converters
----------------

Three different converters are available.

The first screenscrapes sourceforge to get the data. This version will
only be briefly documented.

The second parses the format output by sourceforge's "old" backup
system, known as "The XML Data Export". This export is available to all
project administrators, and is found at the following URL:

https://sourceforge.net/export/xml_export.php?group_id=

The third converter parses the format output by sourceforge's "new"
backup system, known as "an enhanced version of our XML Data Export
facility". This exporter is only available to users that are both
project administrators *and* subscribers, as the export is in "early
release feature" mode. This second export is available under

https://sourceforge.net/export/xml_export2.php?group_id=

Due to the size of the python project, we had to use the "enhanced"
version (xml_export2.php), as the version available to all projects
produced invalid XML with missing data. If the export from the standard
version does not validate as XML due to a missing and a missing , you
have hit the same bug and need the enhanced version, which at the time
of writing is available only to subscribers.

Why Three Different Converters?
-------------------------------

To make a long story short:

* When we began the project of replacing sourceforge's tracker with
  one based on roundup, the XML Data Export was completely broken. So
  Fredrik Lundh provided a screenscraping framework for sourceforge,
  and a converter based on this was written. This converter took about
  15 hours to complete, and was very error prone due to the instability
  of the sourceforge web servers.

* The XML Data Export then got bug fixed, and a new converter was
  written that processed the XML instead of doing screenscraping. It
  was much faster (about 2h), and also more reliable.

* The XML Data Export then began to malfunction, producing invalid
  XML. This problem was reported to sourceforge.

* After a while, sourceforge told us about the enhanced data export,
  which unfortunately was not only restricted to subscribers, but also
  produced a completely different XML format. A third converter was
  written.

The Screenscraping Converter
----------------------------

The following files in this directory are needed by the screenscraping
converter:

* effbot2roundup.py
* handlers.py

The screenscraping converter also requires effbot's sourceforge
screenscraper from http://effbot.org/zone/sandbox-sourceforge.htm

The Standard "XML Data Export" Converter
----------------------------------------

The converter for the standard export available to all project
administrators consists of the following files:

* BeautifulSoup.py
* sfxml2roundup.py
* sfxmlhandlers.py

Basic usage is to run 'sfxml2roundup.py --xmlfile --trackerhome '.

The handlers list in sfxml2roundup.py needs to be adjusted to suit your
roundup instance's schema, and the handlers in sfxmlhandlers.py
probably also need adjustment.
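
Since the standard export has been known to produce invalid XML, it
may be worth checking that a downloaded export is well-formed before
running a converter on it. The following is only a minimal sketch and
is not part of the importer: the function name find_xml_error is
hypothetical, and it assumes a Python 3 interpreter with the standard
xml.parsers.expat module.

```python
import io
import xml.parsers.expat


def find_xml_error(stream):
    """Return None if the stream is well-formed XML, else an error string.

    Hypothetical helper, not part of the importer. `stream` must be a
    file-like object opened in binary mode; it is parsed incrementally,
    so a large export does not have to fit in memory.
    """
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.ParseFile(stream)
    except xml.parsers.expat.ExpatError as err:
        # The message includes the line and column where parsing failed.
        return str(err)
    return None


# Usage sketch; "export.xml" is a placeholder for the downloaded file.
# with open("export.xml", "rb") as export:
#     print(find_xml_error(export) or "well-formed")
```

A well-formedness check like this only detects structural damage such
as unclosed elements. It does not detect missing data or the mixed
character encodings described under Other Important Utilities.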

The "enhanced XML Data Export" Converter
----------------------------------------

The converter for the enhanced XML Data Export available to subscribed
project members consists of the following files:

* config.py
* xmlexport2handlers.py
* xmlexport2toroundup.py

Basic usage is to run 'xmlexport2toroundup.py --xmlfile --trackerhome '.

The handlers list in xmlexport2toroundup.py and the handlers in
xmlexport2handlers.py need to be adjusted to your roundup instance's
schema. config.py contains some mappings between sourceforge's values
for properties of different kinds and the corresponding properties in
your roundup instance.

Other Important Utilities
-------------------------

The fixsfmojibake.py script takes care of the fact that the export from
sourceforge has mixed-up character encodings. The export claims to be
iso-8859-1 but also contains UTF-8. This has only been tested with the
export from the enhanced version, but if you experience character set
trouble with the export from the standard version, this script might
help.

Usage: fixsfmojibake.py < in.xml > out.xml

Getting more help
-----------------

Subscribe to and mail the tracker-discuss mailing list to get hold of
the people that wrote the importer.
http://mail.python.org/mailman/listinfo/tracker-discuss has the
details.

.. _sourceforge: http://sf.net
.. _roundup: http://roundup.sf.net