May 31, 2008

SOC Report: Week 1

Well, the first week of Summer of Code has finished. This week I spent my time evaluating and testing the various options available to (semi)automatically wrap C code (libsyncml) so that it is accessible from Python. My priorities when evaluating the options go something like:

  1. Capability - the tool should be able to (semi)automatically wrap a large majority of the libsyncml API. Any customizations required in order to make the wrapping more complete should be readable and maintainable by people other than myself.

  2. Documentation availability. Follows from #1: can I actually learn and use the tool within the SOC duration?

  3. The wrapping tool is actively developed.

  4. Does not introduce additional runtime dependencies other than the library being wrapped.

  5. Minimal compile time dependencies when creating the bindings.

  6. Community service value (i.e. does the selection and use of the tool bring a positive benefit to the FOSS ecosystem greater than the actual library being wrapped).

The following is a list of available options I looked at (see cython for more explanation)

  • Pyrex Produces a very nice and clean C file, which you just compile to a .so and that's it. Allows you to wrap almost any C and C++ code. The IDL is Python-ish.

  • Cython The same as Pyrex, but with some nice new features added.

  • SWIG The de facto standard, I guess. SWIG is one of the oldest and most mature methods of wrapping C or C++ code into Python (SWIG works for other target languages as well). SWIG produces a C file from an IDL, which gets compiled to a .so, but then it also produces a Python wrapper on top of this. Because the Python wrappers are written for you, if their design is not exactly what you want, you end up doing more work to create your final Python API.

  • SIP Similar to SWIG, but only aimed at wrapping C and C++ to Python. Unlike SWIG there is no Python wrapper. Used by PyQt and PyKDE.

  • Boost.Python The wrappers are written in C++. Not evaluated due to the additional dependencies required.

  • Ctypes Ctypes is included as standard in Python 2.5. The IDL is typically a Python class hiding the ctypes calls, making the API more pythonic. It allows one to call library functions defined in shared object libraries directly from interpreted Python code.

  • Py++ It generates Boost.Python wrappers. Not evaluated.

  • f2py It's mostly for wrapping Fortran files, but it can also wrap C files, even though that is not a very well-known feature. Not evaluated.

  • PyD This works like Boost.Python, but for the D language. Not evaluated.

  • Interrogate This works similarly to SWIG. It creates dynamic link libraries that can be used both from Python and C++ via the Python C API. No other files are needed. It's not very well documented but is used in several commercial MMORPGs and is native to the Panda3D engine. Not evaluated.

  • Robin Insufficient documentation to evaluate. Similar approach to SWIG, sans the intermediate IDL.

  • PyBindgen The IDL is itself Python, and it generates clean, readable, dependency-free C code. Designed for wrapping C++, but has some support for wrapping C libs.

  • pygobject (codegen.py and h2defs.py) The GObject way, and the way I am most familiar with. Unfortunately, in order to wrap the libsyncml library I would first need to wrap it in GObject.
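As a rough illustration of the ctypes style described in the list above (a Python class hiding the ctypes calls), here is a minimal sketch. It wraps strlen from the C standard library rather than anything from libsyncml, since no libsyncml function names appear in this post. The class name CString is hypothetical, and loading libc this way assumes a POSIX system.

```python
import ctypes
import ctypes.util

# Load the C standard library. If find_library returns None, CDLL(None)
# exposes the symbols already loaded into the process (POSIX only).
_libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s);
_libc.strlen.argtypes = [ctypes.c_char_p]
_libc.strlen.restype = ctypes.c_size_t


class CString(object):
    """Hypothetical Pythonic wrapper that hides the raw ctypes call."""

    def __init__(self, text):
        # C expects bytes, so encode once up front.
        self._raw = text.encode("utf-8")

    def __len__(self):
        # Delegates to the C function; callers only ever see len().
        return _libc.strlen(self._raw)
```

Callers would write len(CString("libsyncml")) and never touch ctypes directly, which is the appeal of this approach: no compile step, at the cost of declaring every signature by hand.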

Conclusions

The libsyncml library uses the GObject mainloop, and custom error types. In order to integrate this with pygtk applications it would need to link to Pygobject/C, and propagate the error types to exceptions.

Somewhat unsurprisingly, the weak point in almost all of these approaches is their documentation. While I like the look of PyBindgen, it is a nightmare to build, and docs are sparse. The SWIG IDL is hairy, and one must also maintain pythonic wrappers to make a nice library. Pyrex and friends do not seem suited to the integration of libsyncml and pygobject without additional C glue.
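To picture what propagating error types to exceptions could look like on the Python side, whichever wrapping tool ends up being used, here is a hypothetical sketch. The error codes, class names and the check helper are all invented for illustration; they are not libsyncml's real error values or API.

```python
class SyncError(Exception):
    """Base class for errors raised by the (hypothetical) binding."""


class SyncTimeoutError(SyncError):
    pass


class SyncAuthError(SyncError):
    pass


# Illustrative codes only; a real binding would mirror the C library's enum.
_ERRORS = {1: SyncTimeoutError, 2: SyncAuthError}


def check(code, message=""):
    """Return normally on success (0); otherwise raise the mapped exception."""
    if code == 0:
        return
    # Unknown codes fall back to the base class so nothing is silently dropped.
    exc_class = _ERRORS.get(code, SyncError)
    raise exc_class(message or "error code %d" % code)
```

Every wrapped call would pass its return code through check, so Python callers can use ordinary try/except instead of inspecting integers.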

At this stage I am leaning towards SWIG, for community service value (others can come along after and make C# wrappers for instance), its availability of documentation, and even if the IDL is quirky, others are familiar with it.

Distributed Version Control Systems and visibility of development

My opinion on the 'best' DVCS is not relevant. What I am concerned about is that if GNOME does not pick one, and/or provide some sort of hosting or method to track other people's development branches, then the visible activity level, and subsequently the health of the whole project, will suffer.

The premise here is that centralized version control systems make it easy to follow what developers are working on, and the activity level of development, via the svn-commits mailing list for example.

I can only offer anecdotal evidence here, but I think that the visibility of a project's development is just as important as the actual rate of development being done.

  • If developers cannot see what other people are hacking on, then there is the potential for duplication of work, or conflicting implementations.

  • If users do not see people actually doing work, then there is a tendency to assume the project is 'abandoned' or dead. The only thing worse than a 'dead' project is being proclaimed as such when one is not.

I consider the plethora of ways one can follow what developers are doing part of the problem, not part of the solution.

Who has time to follow planet, IRC, github, repo.or.cz, freedesktop git, launchpad.net bzr, mailing lists, twitter, $COMPANY gitweb, $PERSONAL gitweb, $DISTRO viewvc and gnome.org/$USER_HOME_DIR to see what people are working on?

This post is not meant to be a reductio ad absurdum, it's just a slight generalization of why I read planet.gnome.org/svn-commits, etc, etc.

  • Part of the reason is to see what other hackers are up to.

  • Part is academic, to learn techniques and design from some of the great hackers on here.

  • Part is flagrant procrastination.

  • The small remaining part is to keep the voice in my head that says "you should be using KDE, it appears to be more actively developed" at bay.

Conclusions, if any:

  • planet.ubuntu.com seems to have excellent visibility of active development, even if it doesn't have as many developers as other distributions.

  • freedesktop (via planet.freedesktop.org and http://gitweb.freedesktop.org/) seems to have excellent visibility of development (many people put git branches in their home directories, which are subsequently picked up by gitweb).

  • I am not advocating activity over productivity (obviously we are all free to use the tools which allow us to be the most productive, not just appear the most active). I just think that public FOSS development is an interesting space, in many ways the developers of the products are the marketers of it.

  • GNOME used to have the balance of visibility about right, but I think we are losing that with all this dilution.

Change scares me. That is all.

Apr 1, 2008

More Conduit GSOC Ideas

I see that Google has extended the SOC application deadline. Here are some Conduit-related SOC ideas for GNOME.

  • Port Tomboy sync to use Conduit (and get free support for $WEBSERVICES) Use Conduit's DBus interface and our C# bindings to said interface to be able to configure and initiate synchronization from Tomboy. This means that peer-to-peer Tomboy sync will get easier (no more ssh fuse), and that support for additional websites/mobile devices will become available.

  • Port F-Spot photo export to use Conduit With the exception of Menalto Gallery, we support most/all of what F-Spot can export to. I can count a number of times where F-Spot has hung/crashed during photo export. Using Conduit to do the sync will help prevent this, and ensure that if something untoward does happen, duplicate photos will not be uploaded when you restart the sync.

  • Mobile phone support

    • Integration with gnome-phone-manager for phone discovery and/or pim data Get/Set

    • Use of python-gammu for fetching PIM data, and photos from those devices that do not support obex-ftp

  • GStreamer based media transcoding We currently call FFMPEG or mencoder via command line to convert/scale video and audio files. I would like to use GStreamer. A possible solution to this would be to create a GStreamer transcoding utility (GUI and command line), or the use of the Python GStreamer bindings from within Conduit. While I prefer the latter, the former would be a useful addition to the GNOME desktop.

  • Port cheese to use Conduit for photo and video site upload This would be based upon our glib dbus bindings.

  • More GNOME plugins using our DBus interface

    • (finish) Eog plugin for photo site upload

    • Add video upload to Youtube and Vimeo from Totem

    • Better nautilus integration. Removable volume support has improved in Conduit. It would be good to expose this from nautilus, although I am not sure the role this would take, for example

  • Support Windows Mobile devices. We have preliminary support for SynCE, but this was never completed. There are capable Python bindings to SynCE, so it would take a hacker with a WM5/WM6 device to finish this.

  • Support palm pilots. Once again, not something I can work on as I do not have a device. There are some python bindings for getting data from Palm devices, and there is also the possibility of wrapping the GNOME pilot code to enable it to be used from Python.

Mar 13, 2008

Summer of Code

Work on Conduit is progressing nicely, squashing bugs that have appeared when interacting with the new versions of GNOME applications. I have kind of been tracking the GNOME release schedule (congratulations on the latest release BTW), so I expect to put out another Conduit release this week.

I intend to keep making stable bug fix releases until there are Python GIO bindings. At that time it would probably make sense to branch so that I can begin targeting 2.24, and have the opportunity to land some more invasive changes.

The current bugs that are being hacked on (i.e. have patches), depending on how the implementation turns out, may appear in $NEXT_STABLE_RELEASE, or in $FUTURE_RELEASE.

Summer of Code Projects I created this page on the GNOME wiki which includes some summer of code ideas for Conduit (and by extension, synchronization in the GNOME desktop).

Disclaimers:

  1. I am eligible, and will be applying to participate in summer of code this year.

  2. I won't be offended if people apply to do the same tasks that I am going to be proposing. The more the merrier!

  3. I'm not sure how the whole mentoring regulations work. Can I mentor and be a student? Are there mentors in the GNOME community who would like to champion Conduit integration in their application?

Mar 29, 2007

Conduit Updates

It's been a while since I blogged about Conduit, and a lot has been going on. My main task has been working on the conflict resolution UI and on tweaking the conversions between datatypes so as to maintain maximum fidelity. John Carr, the other main Conduit contributor, has also been working on some awesome stuff.

Conflict Resolution

During the synchronization process a conflict can occur in any of the following scenarios:

  1. In a one way synchronization, if the destination data is newer than the source data (the exported data should not clobber user updated data case)

  2. In a two way sync when the destination data has been modified since the last synchronization (the two computers sync to the same location but both have had source changes case)

  3. In any situation when (for backend specific reasons) a comparison between data is unable to determine which is newer than the other.

  4. A piece of source data has been deleted and the synchronization policy in effect states that the user wants to be certain before deleting the last remaining copy of that data at the destination. (I handle delete as a special case of a conflict)
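The four scenarios above can be sketched as a single decision function. This is only an illustration, not Conduit's actual code: the Item record, the find_conflict name, its parameters and the returned reason strings are all hypothetical.

```python
from collections import namedtuple

# Hypothetical record; mtime is None when the backend cannot supply one.
Item = namedtuple("Item", ["uid", "mtime"])


def find_conflict(source, dest, two_way, last_sync, confirm_delete=True):
    """Return a short reason string if the pair conflicts, else None.

    source is None when the source item has been deleted.
    """
    if source is None:
        # Scenario 4: a delete is a conflict when the policy asks to confirm.
        if confirm_delete and dest is not None:
            return "delete"
        return None
    if dest is None:
        return None  # nothing at the destination to clobber
    if source.mtime is None or dest.mtime is None:
        return "unknown"  # Scenario 3: comparison cannot decide
    if two_way:
        # Scenario 2: destination modified since the last synchronization.
        if last_sync is not None and dest.mtime > last_sync:
            return "both-modified"
        return None
    if dest.mtime > source.mtime:
        return "dest-newer"  # Scenario 1: do not clobber newer data
    return None
```

A backend would call something like this per item and emit a conflict signal to the UI whenever the result is not None.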

Dataprovider backends communicate conflicts to the UI via signals. The UI has access to the conflicting data so can offer the user the ability to compare which is newer. Conflicts use the same arrow metaphor that the main UI uses and can either be resolved in the main window, or be shown with more detail in their own window.

[Screenshots: ConduitConflict.png, ConduitConflictWindow.png]

If the data has set a (gnome-open'able) URI then the user can view the data at this location and decide which is newer. For example, a user is synchronizing his Tomboy notes via an iPod. If there is a conflict in this scenario then clicking Compare will cause Tomboy to show the local note and will launch a text editor to inspect the conflicting note off the iPod. (Lazyweb request: Any idea on how I might arrange the windows side-by-side, via libwnck for example?)

Fidelity

The main difficulty with synchronizing different datatypes, or the same datatypes but via a middleman, is maintaining fidelity (that is, not losing any information) during the whole process. In the case of contact or calendar data the default would be to support just standard datatypes (vcard, ical, etc).

  • Opensync 0.30 is looking to define their own XML schema for basic datatypes, and I would support this idea and likely adopt the same schema.

  • Apple .mac uses strict datatypes and pushes everything to a central server. One alternative to maintaining fidelity is to keep data in all formats but maintain a single canonical source of modification times, and perform conversions (even if they lose fidelity) only when needed from this central store.
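The alternative mentioned in the last point (keep the data in its original format, track one canonical modification time, and convert only on demand) could look roughly like the sketch below. The CentralStore class, its methods and the converter table are hypothetical names made up for this illustration.

```python
class CentralStore(object):
    """Hypothetical store: original data kept intact, conversions made lazily."""

    def __init__(self, converters):
        # converters maps (from_format, to_format) to a conversion function.
        self._converters = converters
        self._data = {}   # uid -> {format: payload}
        self._mtime = {}  # uid -> single canonical modification time

    def put(self, uid, fmt, payload, mtime):
        # A new write replaces any stale converted copies of this item.
        self._data[uid] = {fmt: payload}
        self._mtime[uid] = mtime

    def get(self, uid, fmt):
        formats = self._data[uid]
        if fmt not in formats:
            # Convert (possibly losing fidelity) only when someone asks.
            for have, payload in list(formats.items()):
                convert = self._converters.get((have, fmt))
                if convert is not None:
                    formats[fmt] = convert(payload)
                    break
            else:
                raise KeyError("no conversion to %r" % fmt)
        return formats[fmt], self._mtime[uid]
```

Because the original payload is never overwritten by a converted copy, a lossy conversion for one consumer does not degrade what another consumer sees.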

This leads on to my next suggestion. Having an optional central repository (like .mac) not only addresses some of the difficulties of maintaining fidelity through conversion but also opens up some other use-cases which the open source world has not addressed.

  • Online editing of tomboy notes while also being able to sync them on multiple computers (I can currently do this through backpackit.com but the conversions lose fidelity/information)

  • Synchronize desktop settings (wallpaper, theme, nautilus prefs, etc.) between two computers

  • One click file sync with no configuration.

So, lazyweb hackers: who here is a Django hacker that wants to write me a web service to do this? I already have a documented API that you could write against. I have the spec and the idea, but not the time. I had considered writing this myself and setting it up as a paid service (I would pay a few dollars a month to be able to do this), but I don't have the time, the web development experience, or an inkling of artistic ability to make it look nice! Any takers, please email me!

Miscellany

  • John has finished up all the remaining pieces to allow iPods to be hotplugged/removed in conduit, so that synchronization settings and state are preserved.

  • He is also hacking on direct computer-computer sync over the local network, using avahi for discovery and pickling data over the wire. Should be awesome when this lands!

  • Unfortunately I was not eligible for summer of code this year. I am on holiday from university for six months and won't be starting my PhD till after the summer. I was pleased to see mention of Conduit in the Ubuntu SOC ideas page, so I am glad to see distros recognizing the importance of making synchronization easy!

  • Releasing 0.3.0 is blocking on my having a working Ubuntu Feisty install to test against. (Lazyweb request: Do suspend, resume and hotkeys for Panasonic sub notebooks such as the CF-R4 work on Feisty out of the box? They have not done so in previous releases.)

Update: Google just released Python GData bindings. This will make excellent support for Google Calendar, Google Notebook and a few other ideas I have up my sleeve a reality in the near future! My blog is also back up, sorry for the trouble.