Custom URL support for Upstream Tracker Project
One of the features available on the Upstream Tracker project is the ability to process custom URLs to track upstream package versions. This allows URLs to contain regular expressions which are processed using the re module in python. To demonstrate this feature, let me post a few sample inputs and outputs that were generated using this module.
Custom URL : http://coherence.beebits.net/download/Coherence-(.*)\.tar\.gz Coherence-0.6.6.2.tar.gz http://coherence.beebits.net/Coherence-0.6.6.2.tar.gz Custom URL : ftp://ftp.gimp.org/pub/gimp/help/gimp-help-(.*)\.tar\.gz gimp-help-2-0.11.tar.gz ftp://ftp.gimp.org/gimp-help-2-0.11.tar.gz Custom URL : http://code.google.com/p/giver/downloads/list/giver-(.*)\.tar\.gz giver-0.1.8.tar.gz http://code.google.com///giver.googlecode.com/files/giver-0.1.8.tar.gz Custom URL : http://sourceforge.net/api/file/index/project-name/libyui/rss/libyui-(.*)\.tar\.gz libyui-gtk-2.21.96.tar.gz http://sourceforge.net/projects/libyui/files/libyui-gtk-2.21.96.tar.gz Custom URL : https://launchpad.net/dee/+download/dee-(.*)\.tar\.gz dee-1.0.10.tar.gz https://launchpad.net/https://launchpad.net/dee/1.0/1.0.10/+download/dee-1.0.10.tar.gz Custom URL : https://launchpad.net/dee/+download/dee-0\.(.*)\.tar\.gz dee-0.5.22.tar.gz https://launchpad.net/https://launchpad.net/dee/0.5/0.5.22/+download/dee-0.5.22.tar.gz
From the above examples, we can see that the module supports projects hosted on sourceforge as well as projects where the download pages are accessible via HTTP and FTP. It is also possible to narrow down and search for only 0.x branch of a package using suitable regex as shown in the last example. The module tries to implement complete regex support and does this with the help of the python re module. However, to identify if a term has to be evaluated using re, we look for parenthesis. Thus, regular URLs with parenthesis would be treated as regex terms and would be processed accordingly. For example, consider the URL below :
Here, we list all the links present at https://launchpad.net/dee/+download/ and apply the regex dee-0\.(.*)\.tar\.gz on each of them. The links that conform to the format are processed to obtain the latest version and the location or the absolute URL of the tarball/source code. Here, dee-0\.(.*)\.tar\.gz is considered as the regex term as it contains parenthesis. Hence, currently, all regex terms should contain parenthesis ‘( )’ and should NOT contain ‘/’.
As work continues, there would be ways to escape the parenthesis which would enable users to work with URLs which contain parenthesis.