Planet entries by jrocha


  • OCRFeeder 0.8.1

    By jrocha, on 22 December 2014 08:38

    Taking advantage of the holidays, I have been dedicating some time to my side projects so today I am giving you OCRFeeder version 0.8.1!

    The last OCRFeeder version had a very important change which was the port to GObject introspection and I was already expecting a few bugs to pop up here and there. That proved to be true and so this version is mainly about bug fixing.
    Specifically, there was an issue related to GDK’s threads which caused the application to abort. Besides that, exporting a document or saving/loading a project was not working correctly due to Unicode issues (because Python is very nice but working with Unicode is sometimes more annoying than it should be, at least in versions prior to Python 3).
    Anyway, all that should be working correctly now!

    Besides squashing bugs, I also made some long-overdue changes: made the Preferences dialog smaller (by adding its contents to a scrolled window) and migrated the application and engines’ settings to the XDG user configuration folder as opposed to .ocrfeeder.
    Yes, I know that I should be using GSettings for the application’s settings by now but there were more critical changes to be done.
    Besides a small change in the widgets that set a box’s type (from a radio button style to a non-indicator, grouped pair of buttons), there are no other UI changes but I really like how much more polished OCRFeeder seems with the nice recent GTK+ styles.
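
    To illustrate the settings migration mentioned above, here is a minimal sketch of how an application can locate its folder inside the XDG user configuration directory. It is not OCRFeeder’s actual code: the function name xdg_config_dir and the "ocrfeeder" folder name are assumptions made for the example. It follows the XDG Base Directory convention of using $XDG_CONFIG_HOME when it holds an absolute path and falling back to ~/.config otherwise.

```python
import os


def xdg_config_dir(app_name="ocrfeeder", environ=None):
    """Return the per-user configuration folder for app_name.

    Hypothetical helper: uses $XDG_CONFIG_HOME when it is set to an
    absolute path, otherwise falls back to ~/.config.
    """
    environ = os.environ if environ is None else environ
    base = environ.get("XDG_CONFIG_HOME", "")
    # An unset, empty or relative value is ignored in favour of ~/.config.
    if not os.path.isabs(base):
        base = os.path.join(os.path.expanduser("~"), ".config")
    return os.path.join(base, app_name)


if __name__ == "__main__":
    print(xdg_config_dir())
```

    In a GTK+ application the base folder can also be obtained from GLib.get_user_config_dir(), which avoids reading the environment by hand.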

    ocrfeeder-0.8.1-screenshot

    Future

    I have a number of ideas to make the application better not only in terms of UI/UX but also in terms of features. The detection algorithm hasn’t been touched for years and I am sure it can be improved not only in terms of performance but also in terms of accuracy.
    One cool feature I’d love to see implemented is to have a quick way of translating a document’s contents. This would be helpful e.g. to users living abroad who might need to translate letters to a language they speak.
    Nonetheless, as mentioned in my previous post about OCRFeeder, it is indeed not easy to find the time and motivation to dedicate to the project these days with all the work, life and other side projects so I don’t know when I will have time for it again. In that regard, if you want to give me a hand, you’d make me very happy as there is a lot of work to be done.

    Happy holidays everyone!

    Source tarball
    Git
    Bugzilla

  • After a long time without a new release, OCRFeeder 0.8 is out! The previous version was released in February 2013 from another continent 🙂 After that a lot of things happened in my life (very good ones) and I didn’t really have much time to devote to the project.

    What’s up?

    This version represents one big change: it was ported to GObject Introspection (and thus GTK+ 3)!
    This is also related to the delay (because GooCanvas’s GI, a dependency, was not usable in the beginning). Also, after the port started, a few things were deprecated in GTK+ (like Stock items), but this will only be updated in a future release.

    I didn’t want many new features in this version as I wanted it to be basically about the port to GI. This way, any bugs that turn up are likely to be about this change and not about unstable new features. I included a small novelty however: support for multi-page TIFF images.
    There are, of course, some other small improvements that were developed, as well as a number of bugs that were fixed.

    Future

    Work, life and other projects make it more and more difficult to find the time to work on OCRFeeder. I would nonetheless be happy to help anyone interested in contributing to take their first steps. I believe that OCRFeeder is a useful project and not only for accessibility purposes (although this is a great reason on its own!) so, if you like Python, GTK+, and want to help make this project better, drop me an email.

    Once again, I need to thank the awesome GNOME i18n team for keeping OCRFeeder available in many languages and my dear friend Berto for keeping the Debian package up to date and for the useful bug reports!

    Source tarball
    Git
    Bugzilla

  • I have a PlayStation 3 and I love working with new types of user input so, as my last hack of the year, I wanted to use the Leap Motion Controller to play some game on the PS3.
    The Leap Motion Controller is obviously not compatible with the PS3 so the plan was to use a regular computer, interpret the gestures from the Leap Motion, and send the respective controls to the console.

    For the game, I chose GTA V because it involves many different actions such as running, jumping, driving or shooting… and it’s awesome!

    Here is the video of yours truly using this script to do some disastrous driving but having a lot of fun with the Leap Motion and GTA V:

    The reason why the big video has such a low quality and the tiny one is fine is that they were recorded with my Nexus 5 and my Canon S95, respectively, and my living room was very dark.

    How it works

    As seen in the video, it is also possible to control the PS3 menu and choose the game from there. The player’s actions I chose to implement were walking, running, jumping, driving and entering/leaving a vehicle. All of those were easy to implement except for the driving. The thing is that I can easily get the angle of the imaginary steering wheel that a user holds over the Leap Motion device but I could only simulate turning the left analog stick fully to the left or to the right. This makes it kind of difficult to steer a car, as can be seen in the video, but it’s still fun to do it.
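
    The steering limitation described above can be sketched as follows. This is only an illustration, not the script’s real code: it assumes the two hand positions arrive as (x, y) tuples with y pointing up, and the names steering_angle, steering_key and the 15 degree dead zone are hypothetical. The angle itself is continuous, but the output is all-or-nothing, which is what makes the car hard to steer.

```python
import math


def steering_angle(left_hand, right_hand):
    """Angle in degrees of the line joining both hands.

    0 means the imaginary wheel is level; a positive angle means the
    right hand is higher than the left one (wheel turned to the left).
    """
    dx = right_hand[0] - left_hand[0]
    dy = right_hand[1] - left_hand[1]
    return math.degrees(math.atan2(dy, dx))


def steering_key(angle, dead_zone=15.0):
    """Map the wheel angle to a full left/right command.

    Inside the dead zone no key is pressed; outside it the stick is
    pushed fully to one side, whatever the size of the angle.
    """
    if angle > dead_zone:
        return "left"
    if angle < -dead_zone:
        return "right"
    return None


if __name__ == "__main__":
    angle = steering_angle((-100, 150), (100, 250))
    print(angle, steering_key(angle))
```

    A proportional mapping from the angle to the stick position would make steering smoother, but that requires sending analog values rather than key events.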

    For the communication with the PS3, the script uses the GIMX project which makes it possible to simulate a SixAxis game pad from a computer and send its actions over Bluetooth to the PS3. GIMX has some nice utilities, the main one being emuclient, which detects key events and uses a configuration file to map them to the actions of the SixAxis. It would be much more elegant to send the commands to the PS3 directly from the script I wrote but it was simply faster to instead simulate the key events and let GIMX do the rest with the right configuration file.

    As with the Leap GNOME Controller, this is a small script rather than a big project. To know how to use it, please refer to the README file that ships with it. Hopefully someone would like to try it out and improve the current gestures or make new ones.

    Get the source at GitHub and have a great 2014!

  • When I explained how the Leap Motion device could be used on Fedora 19, I mentioned how I had one of those early prototypes. Well, Leap Motion was extremely kind and sent me an actual device as a thank you for starting the thread asking for Linux support. Now that GUADEC is over and I am spending my vacation in Portugal, I had a little time to play with my fancy new device and wrote a relatively small script to control GNOME with it. I call it the über original name of: Leap GNOME Controller!

    For those who don’t care about technical details, here’s the video showing what can be done with Leap, GNOME and this script. Technical details follow below the video:

    The two videos that compose the one above were recorded with an HD camera and GNOME Shell’s screencast recorder. I tried to sync them the best I could but a certain delay can be noticed, especially at the end of the video.

    The code

    Leap Motion provides a “closed source” shared library and a high-level API with the respective documentation for the many bindings it has. To code it quickly, I used the Python bindings and Xlib to fake the input events.

    Leap Motion’s APIs make it really easy for one to simulate a touch-screen. It even offers a “screen tap” gesture that should be the obvious choice when mapping a finger touch/tap to a mouse click. However, this didn’t work very well. The problem is that if we are tracking one finger to control the mouse movement, when performing the “screen tap” gesture, the finger (and mouse) will of course move, which makes it as frustrating as seen in Ars Technica’s hands-on video.

    I came up with a solution for this by dropping the “screen tap” gesture and using the “key tap” instead. The “key tap” is a simple, quick down-and-up finger movement, like pressing a key. This is much more precise and easier for a user to do than the “screen tap”. Of course, when the finger moves to perform the gesture, the mouse pointer would move as well, so I came up with a little trick to work around this: when the mouse pointer doesn’t move more than a couple of pixels for half a second, it will stop and only move again if the user moves it more than 150 pixels. This lets the user stop the pointer precisely where it needs to be and perform the gesture without making the pointer move.
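
    The pointer trick described above can be sketched like this. It is a minimal sketch and not the script’s actual implementation: the class name PointerStabilizer and its parameters are hypothetical, “a couple of pixels” is assumed to be 2, and timestamps are passed in explicitly (in seconds) so the logic can be followed without a device.

```python
import math


class PointerStabilizer:
    """Freeze the pointer once it rests, release it after a large move."""

    def __init__(self, still_radius=2.0, still_time=0.5,
                 release_distance=150.0):
        self.still_radius = still_radius          # pixels
        self.still_time = still_time              # seconds
        self.release_distance = release_distance  # pixels
        self._anchor = None       # position where the pointer came to rest
        self._anchor_time = None  # when the pointer arrived at the anchor
        self._locked = False

    def _restart(self, x, y, now):
        self._anchor = (x, y)
        self._anchor_time = now
        return (x, y)

    def update(self, x, y, now):
        """Take the tracked position and return where the pointer should be."""
        if self._anchor is None:
            return self._restart(x, y, now)
        distance = math.hypot(x - self._anchor[0], y - self._anchor[1])
        if self._locked:
            if distance <= self.release_distance:
                return self._anchor  # ignore the tap's small movement
            self._locked = False
            return self._restart(x, y, now)
        if distance > self.still_radius:
            # Still moving: follow the finger and restart the rest timer.
            return self._restart(x, y, now)
        if now - self._anchor_time >= self.still_time:
            self._locked = True
            return self._anchor
        return (x, y)


if __name__ == "__main__":
    stabilizer = PointerStabilizer()
    for sample in [(100, 100, 0.0), (101, 101, 0.6), (110, 160, 0.7)]:
        print(stabilizer.update(*sample))
```

    While the pointer is locked, a “key tap” can be turned into a click at the anchor position, since the finger’s down-and-up movement stays well below the release distance.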

    Future

    The Leap device offers a lot of possibilities for adding many gestures. Ideally they should be implemented per application but being able to control the shell is already pretty useful, so it would be wonderful to fine-tune the current gestures and add new ones. I also wish the library’s source code were open because I ran into small issues and would like to be able to look at the source code instead of trying to fix them based on theories about what might be wrong.

    I haven’t explored the AirSpace appstore yet so I don’t know if it is worth adding (or possible to add) this script there but I will check it out.

    Have fun with Leap and GNOME!

  • Here is 2013’s first version of OCRFeeder, version 0.7.11.

    For this version, a number of bugs were fixed, especially some that were affecting saving and loading projects.
    Some small improvements were also made such as being able to load multiple images at once and being able to choose the OCR engine from the command line interface version of OCRFeeder (using the -e option).

    Now for the main feature, I developed something that had been requested by a good number of users: being able to easily choose the language for the OCR engine.
    When I developed OCRFeeder, I wanted to make it easy for users to use system-wide OCR engines from the layout analysis that OCRFeeder performs but I also wanted it to remain powerful and that’s why the engines are configured in a general, abstract way, as if from the command line.
    Some OCR engines support setting the language in order to get a better recognition and, while users could already set the language of an engine manually using the OCR editor dialog, they wanted to have a nice drop-down list with the languages instead.
    This represented a real challenge: to keep the old and flexible configuration and, at the same time, offer a high-level way of choosing the language.

    OCRFeeder's new configuration
    So here is how it works. There is a new special argument keyword, $LANG, that will be replaced by the value of the new “language argument” field followed by the currently set language. Since engines support different languages (or none) and call them different names (e.g. Tesseract expects “por” for Portuguese, others may expect “pt”), there is another new field called “languages” which should be a map between the ISO 639-1 language code and the name the engine expects for that language, as shown in the screenshot.
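
    The substitution described above can be sketched as follows. This is an illustration under assumptions rather than OCRFeeder’s real code: the function name expand_language is hypothetical, the “languages” field is represented as a plain dictionary, the $IMAGE placeholder in the usage example merely stands for the rest of the engine’s arguments, and dropping $LANG when the engine has no name for the chosen language is a guess at a sensible behaviour. The -l flag is the one Tesseract uses to select a language.

```python
def expand_language(arguments, language_argument, languages, language):
    """Replace the $LANG keyword in an engine's argument string.

    arguments         -- the engine's arguments, e.g. "$IMAGE out $LANG"
    language_argument -- the engine's language flag, e.g. "-l"
    languages         -- map from ISO 639-1 code to the engine's own name
    language          -- the ISO 639-1 code currently selected by the user
    """
    engine_language = languages.get(language)
    if engine_language is None:
        # Assumption: an unsupported language removes the argument.
        replacement = ""
    else:
        replacement = "%s %s" % (language_argument, engine_language)
    expanded = arguments.replace("$LANG", replacement)
    # Collapse the extra whitespace left behind by an empty replacement.
    return " ".join(expanded.split())


if __name__ == "__main__":
    tesseract_languages = {"pt": "por", "en": "eng"}
    print(expand_language("$IMAGE out $LANG", "-l", tesseract_languages, "pt"))
```

    Keeping the mapping per engine is what lets the same user-facing language choice produce “por” for one engine and “pt” for another.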

    Languages combo
    To show the languages, there is a new tab in the areas’ editor called Misc (for lack of a better name for a tab that will hold more stuff in the future) with the languages combo. This combo shows a check on the languages that the currently selected engine recognizes, as seen in the screenshot.

    There is also a new setting in the preferences dialog with the default language and the first time the application runs, it will assign it to the user’s locale.
    One thing must be taken into account: even though Tesseract supports an extensive list of languages, users must have the corresponding packages installed in their distros; otherwise, recognition will of course fail.

    To finish, related to my recent job search, I have spent this week in San Francisco getting to know some people from an exciting start-up and, despite the jet lag, I managed to finish this release so I can now say that at least part of OCRFeeder was designed and developed in California 😛

    Source tarball
    Git
    Bugzilla

  • Winds of Change

    By jrocha, on 14 January 2013 10:12

    In my previous post I mentioned that 2013 would be a year of change. Well, here is the moment to say why that will be so: I have quit Igalia.

    Igalia is a very special company to me. I joined it in December 2008 and these were 4 intense years in which I saw how the company evolved, how it moved to a cool new office and how it grew, and I learned a lot there. I had the chance to participate in several important projects like Maemo or MeeGo and also to create others. I could even tell the world about them in the many conferences I spoke at and I am also proud to have accomplished things such as putting the company’s name for the first time in the highlights of online media like Ars Technica.

    So the question people always ask is: why did I leave!?
    As some of you may know, Igalia is organized in a flat structure where we take more responsibilities than just coding and the ultimate part of a career in the company is to become a partner. I knew this when I joined and I think this is a wonderful thing. Being at the end of my 4th year, the next stage would be to become a partner. However, for a while now I have been feeling the need for a change, for trying something different. I take my responsibilities seriously so joining as a partner would 1) only perpetuate these feelings and 2) not be fair to my colleagues. This and other factors led me to make the very difficult decision of leaving.

    The future

    My wife and I moved to A Coruña (Galicia, Spain) shortly after I joined Igalia. We like the city and its people but moving is part of that change I was talking about and the truth is that we were only here for Igalia in the first place. (I will probably write a few more words about this beautiful city when we actually leave)
    The most difficult part of it is definitely leaving our friends. We met very nice people during these 4 years in Coruña and we consider some of them good friends rather than simply coworkers. But life is like this and I am sure we’ll stay in touch.
    On the other hand, the good thing about working in a Free Software company is that you can keep contributing to the projects you worked on there if you want, so I hope I will keep doing that.

    Since I only started looking for a new job after I notified Igalia of my decision, I still do not know where we will move to but we are open to many places.

    If you are interested in what I can do for your project or company, be sure to contact me through email or LinkedIn so I can send you my CV.

    That is all. I am already in touch with some companies so wish me luck!

  • I spent the first half of this week in the beautiful city of Évora, where I was born. The occasion was the Semana da Ciência e Tecnologia (Science and Technology Week) of the University of Évora, to which I was invited.
    I also ended up giving the organization a hand by inviting Thomas Perl (the restless mind behind gPodder) and Lucas Rocha (well-known GNOME developer now using his powers at Mozilla) to speak, and both kindly accepted.

    Having participated in the organization of events during my time at the University, I’m always happy to see these initiatives taking place.
    It was also great to spend a couple of days with the folks at my University and meet with old friends.

    About the talks, Thomas gave an overview of gPodder and the infrastructure used to manage the project. Lucas gave a really nice talk about what Mozilla is, what it does and why you should care; because of it, I ended up installing Firefox Mobile nightly build for Android and it has improved a LOT.
    My friend Luís Rodrigues (no blog because he’s a badass) talked about CERN, where he works. What an amazing place! He talked about how much CERN uses Python and Django to manage their data. As a Python lover, this makes me really happy.

    This was also the first time I presented Skeltrack, my latest creation inside Igalia. Presenting such an algorithm is not an easy job so I took mental notes about what to improve the next time (which will be at LinuxTag) but I was happy that people asked good questions about it.

    I’d like to thank the AAUE (Students’ Association) for the great time we all had there.

    Presentation slides:

  • That’s right, a couple of weeks ago new versions of SeriesFinale were released.
    There was a long absence between these and the previous releases. The truth is that it has become more and more difficult for me to find the motivation (and time) to do work on an application for platforms I am not currently using. Still, I have had some emails from people showing their appreciation and Juan has also helped a lot (he is the reason there is also a new N900 release).

    If you’re following the development of SeriesFinale, I have recently moved the repository over to GitHub (like I did for most of my projects). GitHub is so much faster than Gitorious and has nice features such as an issue tracker. Before you say it, although GitHub is not Open Source software, we’re talking about a hosted solution for Git repositories from a very cool company and I had no intentions of hosting Gitorious on my own anyway.

    So what’s new in SeriesFinale? I need to differentiate between the platforms’ versions first.
    Harmattan (N9) is on the 0.6.9 version and many bugs were solved like:
    * Marking all episodes from the episodes’ list menu (nd#1)
    * Episodes’ overview height (nd#9)
    * Updating the shows season list
    * Add a close button to show info dialog
    * Add mark none action to the episodes’ list menu

    There are still some issues when scrolling the lists, which I’ve looked into without finding a solution; I am convinced it actually has to do with the Python bindings of QML…

    Fremantle (N900) is on version 0.6.10 and has fewer visible changes but the threads, languages and sorting functions were improved.

    Adding the new Harmattan version to the Nokia Store was also a challenge (it kept being rejected due to tiny details) but it eventually went through.

    Be sure to test and vote for SF on Fremantle, or, in case you have an N9, get the new version from the Nokia Store:

    Get SeriesFinale from Ovi Store

  • Last weekend I gave my annual Django workshop for this year’s students of the Free Software Master that Igalia organizes.

    When I started with Django it was 2007 and I was happy with its version 0.96 :)
    Now at version 1.4, the path it has taken and the improvements it has received are incredible, and it is used by many interesting companies and organizations.

    My presentation for this year’s workshop covers more things and is based on Django 1.4.
    You can check it out below, the license is Creative Commons as usual:


    (direct link to presentation in SpeakerDeck)

  • That’s right, this year GUADEC is taking place in the city I moved to more than 3 years ago in order to become an Igalian: A Coruña. It’s fun to see this event happening just a 20-minute walk from my place when for the previous editions I had to catch several planes in order to attend it :)

    Going to GUADEC

    In this year’s GUADEC, I am presenting two projects I have created:

    • OCRFeeder, the most complete OCR Free Software solution;
    • Skeltrack, the first Free Software library to perform human skeleton tracking from depth buffers such as the ones given by the Kinect.

    If this sounds interesting, be sure to attend the talks or have a chat about the projects when you see me.

    Since I feel pretty much like a local, I can tell you that you must not leave the town without trying “pulpo á feira” (octopus + olive oil + paprika) or, in case you’re not into cephalopods, just go to some traditional Galician bar, have a beer and enjoy the folk music of Celtic origins.