Planeta

  • As promised in some of my previous posts about SeriesFinale, I have finally ported it to GNOME.

    For those who don’t know about this pet project of mine, SeriesFinale is a TV show browser and tracker application that was originally developed for Maemo Fremantle. While I use it all the time on my N900, I have been asked to port it to GNOME, and I also thought it’d be a good thing to have it on my favorite desktop.

    The source code for the port can be found in the “gnome” branch of the SeriesFinale project on Gitorious. Hopefully I’ll find time to clean up the code a bit and prepare Debian and RPM packages. In the meantime, you can try it by cloning the git repository, checking out the “gnome” branch and installing it from source “the Python way”:

    # python setup.py install
    (warning for non-Pythonistas: there is no setup.py uninstall but you get to see where the files are copied to by running this command)

    If you find any bugs, you can file them in the Maemo Bugzilla for now (be sure to specify the platform).
    Let’s see if we can come up with some sort of synchronization for SF in the future so you don’t have to mark your episodes twice.
    For now, if you want to start with the SF information you had on your N900, just copy the series.db file under “~/.osso/seriesfinale” in Maemo to “~/.seriesfinale” in GNOME.
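
    The copy step above can also be scripted. This is only a minimal sketch: it assumes the N900's “~/.osso/seriesfinale” directory has already been transferred to (or mounted on) the desktop, and the function name and its parameters are hypothetical, not part of SeriesFinale.

```python
import shutil
from pathlib import Path


def copy_series_db(maemo_dir, gnome_dir):
    """Copy series.db from a Maemo data directory into the GNOME one.

    maemo_dir: local copy of "~/.osso/seriesfinale" taken from the N900.
    gnome_dir: usually "~/.seriesfinale" on the desktop.
    """
    source = Path(maemo_dir) / "series.db"
    target_dir = Path(gnome_dir)
    # Create the GNOME data directory if SeriesFinale has not run yet.
    target_dir.mkdir(parents=True, exist_ok=True)
    # copy2 also preserves the file's timestamps.
    return Path(shutil.copy2(source, target_dir / "series.db"))
```

    For example, copy_series_db("n900-backup/seriesfinale", Path.home() / ".seriesfinale"), where the first path is wherever you placed the files from the phone.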

    Here is a screencast to show you what it looks like:

    SeriesFinale for GNOME from Joaquim Rocha on Vimeo.

    Hope you enjoy SF on GNOME!

  • The 0.7.1a version of OCRFeeder has been released.

    This version introduces some features implemented by Emergya as part of the GuadaLinfo Accessible project, such as:
    * Importing from a scanner device.
    * Copying text from the content boxes to the clipboard.
    * Users can now use the typical spell-checker dialog to correct mistakes in the text recognized by the OCR engines.

    Other highlights include:

    * Rewritten ocrfeeder-cli (which now also has a help option)
    * Added automatic detection of the Cuneiform OCR engine
    * Moved the OCRFeeder modules to their own folder (so they are better organized and don’t conflict with other modules when installed)

    And some bug fixing:

    * Add the help option to ocrfeeder-cli (gb#630829)
    * Fix selecting all areas
    * Fix ellipsis and title in the queued events dialog
    * Prevent “invisible” boxes creation
    * Remove temporary images for the Tesseract OCR engine

    A big thanks to the great GNOME translators for keeping OCRFeeder available in a number of languages and to Berto for making it available in Debian (which later got into Ubuntu as well).

    Just as I was releasing this version, I realized the spell-checker.ui file was not being installed, so I quickly did a tiny follow-up release, hence 0.7.1a and not simply 0.7.1.

    Download OCRFeeder 0.7.1a source tarball

  • The Magical Art of Extracting Meaning From Data (presentation slides in PDF format)

    Case Study Project Code for The Magical Art of Extracting Meaning From Data (zip file with python source code)

    Presentation Video (partial) from sapo videos:

  • StrongInference – Scipy Superpack: “This shell script will install recent 64-bit builds of Numpy (2.0) and Scipy (0.9), as well as PyMC (2.1 beta) for OS X 10.6 (Snow Leopard) on Intel Macintosh. All builds are based on recent development code from each package, which means though some bugs may be fixed and features added, they also may be more unstable than the official releases. I have also included release builds of Matplotlib 1.0 and iPython, rather than the development versions, under the assumption that most users prefer stable versions of these packages. Distributing them together should improve interoperability, since the supporting packages (Scipy, Matplotlib, PyMC) were all built against the accompanying build of Numpy. These packages were compiled on OS X 10.6 using Apple’s Python 2.6.1, FFTW 3.2.2 and GCC 4.2 (build 5646). To avoid compatibility issues, the installer also optionally downloads and installs the gFortran compiler (4.2) built against Snow Leopard’s GCC 4.2 for Xcode 3.2.”

    As far as I know, this is the best way to install a recent version of NumPy and SciPy on a Mac.

  • The last SeriesFinale version was released before I went to GUADEC and then on vacation, so it’s been a while since you had news from this nice little app. Today I’m releasing version 0.6.5.

    This version has some nice new features apart from regular bug fixing and code improvement.
    Juan has added the portrait mode (borrowed from the great gPodder), which will surely please many users.

    To control the rotation and other forthcoming preferences, I’ve rewritten the settings class and created a settings dialog.

    Sometimes I get sick of getting the “Special” season on every show, basically because I never watch those. So I added a check button to the settings dialog that tells SeriesFinale whether special seasons should be considered or ignored when adding new shows or updating existing ones.

    But if you’re like me and have a bunch of shows already added, it’d be a pain to delete the Special seasons episode by episode in every show. To solve this, and to fill in a missing/neglected action, I’ve added the “Delete Seasons” view, which makes it easy to delete whole seasons.

    Some problems with the threads have been solved as well, so weird issues such as shows’ missing full titles will likely be gone after this version.

    Finally, a feature that has been requested a few times has been added: listing shows by recent episode date. This means that there are now two filters in the shows’ view, which list the shows by most recent episode or by name. This is really useful because, with the recent episodes’ sorting selected, you can update your shows’ list and the ones with already aired, unwatched episodes will be listed at the top.
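
    The two orderings described above can be sketched roughly as follows. This is an illustration only, not SeriesFinale's actual code: the show dictionaries and their “name” and “last_aired” keys are assumptions made for the example.

```python
from datetime import date


def sort_shows(shows, by="recent"):
    """Return shows ordered by name or by most recent aired episode.

    Each show is a hypothetical dict with "name" and "last_aired"
    (a datetime.date, or None when nothing has aired yet).
    """
    if by == "name":
        return sorted(shows, key=lambda show: show["name"].lower())
    # Most recent first; shows without any aired episode go to the end.
    return sorted(
        shows,
        key=lambda show: show["last_aired"] or date.min,
        reverse=True,
    )
```

    Treating a missing air date as the earliest possible date is a design choice of this sketch; it simply keeps shows with nothing to watch out of the way.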

    Here’s the changelog for this version:

    * Add sorting shows by most recent episode or name
    * Add auto-rotation support
    * Add settings dialog
    * Fix problems with threads
    * Fix episodes highlight when checking/unchecking all episodes
    * Rewrite settings
    * Add special seasons addition preferences
    * Make returning to the shows view faster
    * Add delete seasons view

    Soon, in a Maemo Extras repository near you!

    (Oh, and the next time I touch SF’s code it’s very likely that it will be to port it to GNOME, so, stay tuned…)

  • Alcides Fonseca

    How to install RPy2, the Python bindings library for R, on Mac OS X 10.6 Snow Leopard using Homebrew.

    Install R

    To install R on your Mac as a framework, make this change in your homebrew/Library/Formula and then run:

    brew install r

    Install RPy2

    wget http://pypi.python.org/packages/source/r/rpy2/rpy2-2.1.4.tar.gz#md5=cf4e0d80ba498a6d76f107531966478d
    tar xfz rpy2-2.1.4.tar.gz
    cd rpy2-2.1.4/
    sudo python setup.py build --r-home /usr/local/Cellar/r/2.11.1/R.framework/Resources/ install
    

    Troubleshooting

    If you are having a problem related to “-framework vecLib” when installing rpy2, insert a new line after line 134 of setup.py with the following:

    extra_link_args = extra_link_args[:-1]

    Worked for me.

  • I have been hacking on some new and cool features for OCRFeeder for a while, and now it is time to show them to the world in a new release.

    These features fall mainly into two areas: improving the accessibility (a11y) of the UI and improving the recognition of documents.

    A11y Improvement

    The a11y improvements include the typical UI changes, such as adding mnemonics and missing labels and relations, but also other approaches that have more to do with UX, like using a progress dialog to inform users that time-consuming operations are being carried out. This means that PDF import and OCR no longer block the UI.
    Other changes in this category were keyboard navigation through the content boxes (before, these could only be selected by clicking on them), the selection of all boxes and the deletion of selected boxes.

    The following screenshot shows the box editor area of OCRFeeder with its mnemonics highlighted:

    Box edition area

    Recognition Improvements

    Sometimes, text columns are so close to each other that they end up being recognized as a single paragraph, so I added a post-detection method to solve this issue. This feature is optional and can be toggled from the Preferences dialog.

    Here’s an example of the difference it makes:

    Before columns' detection improvements

    After columns' detection improvements

    Scanned document images are usually skewed and this makes it more difficult for the contents to be successfully detected and “OCRed”. I decided to implement an algorithm to deskew these images. The algorithm uses the Hough transform to try to find lines in the image and their angles and, while it is a bit slow, it works well:

    Skewed image

    Deskewed image

    This action can be used on a loaded image but can also be configured to be performed automatically before images are added. The Unpaper tool can now also be set to clean images before adding them.
    This makes it much easier to successfully recognize images obtained from a scanner device.
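
    To give an idea of how a Hough-style skew estimate works, here is a small pure-Python sketch. It is not OCRFeeder's implementation: the function name, the angle range and the step are assumptions, and it works on a list of dark-pixel coordinates rather than on a real image. For each candidate angle, every point votes for the line it would lie on; the angle whose best line collects the most votes is taken as the skew.

```python
import math
from collections import Counter


def estimate_skew(points, max_angle=10.0, step=0.5):
    """Estimate the skew angle (in degrees) of the dominant line.

    points: iterable of (x, y) coordinates of dark pixels.
    Only angles in [-max_angle, +max_angle] are tried, in `step` increments.
    """
    points = list(points)
    if not points:
        return 0.0
    best_angle, best_votes = 0.0, 0
    steps = int(round(2 * max_angle / step))
    for i in range(steps + 1):
        angle = -max_angle + i * step
        theta = math.radians(angle)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        # Along a line tilted by `angle`, y*cos - x*sin stays constant,
        # so points on the same line fall into the same accumulator bin.
        votes = Counter(round(y * cos_t - x * sin_t) for x, y in points)
        top = max(votes.values())
        if top > best_votes:
            best_angle, best_votes = angle, top
    return best_angle
```

    Once the angle is known, the image would be rotated by the opposite angle to deskew it (not shown here). Trying every angle against every point is also why this kind of approach tends to be slow on large images.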

    Some fine-tuning of the content boxes’ bounds was done by trying to shorten their margins, that is, reducing the distance between the boxes and their actual contents.

    The font size recognition was also tweaked to solve the problem of paragraphs with initials (you know, the huge starting characters), which were influencing the whole paragraph’s font size.

    To finish the recognition improvements, I have added an optional action to find and fix the text’s line breaks. Usually, OCR engines don’t consider “semantic line-breaks”, that is, OCR engines always insert a newline at the end of each line.
    Using some regular expressions, I try to find these “fake” line-breaks and recover the original flow of the text. Like some of the features mentioned above, this one can also be turned on/off from the Preferences dialog.
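
    As an illustration of the idea (not the actual expressions used in OCRFeeder), the sketch below treats a newline as “fake” when the text before it does not end a sentence and it is not part of a blank line separating paragraphs. The rule and the function name are assumptions made for this example.

```python
import re

# A single newline that is not preceded by sentence-ending punctuation
# (or another newline) and is not followed by another newline.
FAKE_BREAK = re.compile(r"(?<![.!?:\n])\n(?!\n)")


def fix_line_breaks(text):
    """Join lines that were only broken because the scanned line ended."""
    return FAKE_BREAK.sub(" ", text)
```

    A heuristic like this will still keep a break after a line that happens to end with a period, which is one reason to have it as an optional action.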

    Here’s what the Preferences dialog looks like now:

    Preferences_dialog

    Preferences_dialog_recognition

    To finish, images can now be dragged and dropped onto the pages’ area, and the mouse wheel can be used to scroll horizontally by combining it with the Shift key, thanks to Stefan Löffler. And of course, several bugs were corrected and the code was improved.

    As you can see, this is a “rich” new version of OCRFeeder, which keeps being the easiest way to use OCR on a desktop. You are welcome to file bugs in Bugzilla, to send patches and feature requests to its mailing list, or to approach me if you’re at GUADEC.

    Download: OCRFeeder 0.7 tarball on GNOME FTP

  • It’s been a while now since I released the last version of SeriesFinale.
    The truth is that I’ve been busier than usual these days and, of course, this is reflected in my pet projects.

    As some of you may have experienced, there was a kind of a nasty bug in SeriesFinale’s last version: it wouldn’t update certain shows (when they had been added long ago)… and the good news is that this is one of the things that got fixed in this new version.

    One of the good things Juan introduced for this version is how the next episodes to be watched are shown. Before, the episodes were shown according to their “first aired” date and in case of the same date for two or more episodes, the highest index one would be marked as the one to be watched. In this 0.6.1 version, the episodes are shown according to their number and season, so, if episode #3 has the same air date as episode #4, #3 will always be shown as the next one to be watched.
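
    The new ordering can be sketched like this. It is a simplified illustration, not the application's code, and the episode dictionaries with “season”, “number” and “watched” keys are assumed for the example.

```python
def next_episode(episodes):
    """Return the first unwatched episode by (season, number), or None.

    The air date is deliberately ignored, so two episodes that aired on
    the same day are still ordered by their episode number.
    """
    unwatched = [ep for ep in episodes if not ep["watched"]]
    return min(
        unwatched,
        key=lambda ep: (ep["season"], ep["number"]),
        default=None,
    )
```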

    It is now using a priority queue to download the series’ covers and info that gives priority to the info. This means you won’t have to wait for the info AND covers to download when you hit the Update All menu but instead wait only for the info; the covers will then be downloaded in the background while you use the app normally.
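
    A minimal sketch of such a queue, using Python's heapq module, could look like the following. The class name, the priority constants and the job strings are made up for the example and are not taken from SeriesFinale.

```python
import heapq
import itertools

# Lower numbers are served first (hypothetical priority levels).
INFO_PRIORITY = 0
COVER_PRIORITY = 1


class DownloadQueue:
    """Priority queue that hands out info jobs before cover jobs."""

    def __init__(self):
        self._heap = []
        # The counter keeps jobs of equal priority in insertion order.
        self._counter = itertools.count()

    def put(self, priority, job):
        heapq.heappush(self._heap, (priority, next(self._counter), job))

    def get(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)
```

    With this arrangement, a worker that keeps calling get() finishes every queued info download before it touches the first cover, regardless of the order in which the jobs were added.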

    I’m now introducing the new Russian translation, which Misha Ketslah had kindly sent to me a while ago but that I hadn’t had the time to integrate.

    Here’s the list of major changes for this new version:

    * Add Russian translation (thanks to Misha Ketslah)
    * Fix updating of shows
    * Use a priority queue to differentiate the downloads of covers or series’ info
    * Use only one AsyncWorker at most to deal with the series
    * Prevent the download and usage of images from causing problems
    * Fix showing next episode
    * Add TheTVDB credits
    * Add THANKS file

    So… what about that GNOME version, you ask? I’ve already started to port it to GNOME but couldn’t dedicate much time to it, and taking into account that I’ll be on vacation very soon, it’s likely to take a little longer. But I’m looking forward to using SeriesFinale on GNOME!

    As for the N900 owners, I’ve just promoted the package to Extras-Testing so either use the Extras-Devel repo as usual or wait ~10 days for it to appear in Extras.

  • Grilo is getting really interesting, and one of its newest nice things is the D-Bus interface Juan has been working on lately.

    This D-Bus interface is currently known as Rygel-Grilo (it was originally intended to be a source for Rygel) and uses the MediaServerSpec to allow developers to retrieve the media objects Grilo provides.

    Since there are still no Python bindings for Grilo, I decided to use Rygel-Grilo to be able to use Grilo from Python.
    So I developed a Rhythmbox plugin that shows every MediaServer1 object available and lets the user browse through their contents. Needless to say, although this plugin provides only a very basic and generic usage, it’s easy to see how applications like Rhythmbox could be using Grilo to get their media.
    The philosophy is: Grilo gives you content, GStreamer plays that content, and you’re free to focus on the rest of your app’s details.

    Here’s a video of Rygel-Grilo and the Rhythmbox MediaServer1 plugin in action:

    Grilo MediaServer1 Rhythmbox Plugin from Joaquim Rocha on Vimeo.

    You can find this plugin under the MediaServer1 Plugins project on Gitorious.

    Juan also developed a cool plugin for Totem similar to this one. Take a look at this post to see the plugin working and a more detailed explanation of what Rygel-Grilo is.