Many aspects of the Java platform have improved tremendously over the past few years. Sun has always focused a lot of energy on improving “business” features. Fortunately, they’ve recently shifted their focus to desktop issues. In this regard, they’ve addressed performance (HotSpot compiler, refreshed OpenGL / DirectX based Java 2D pipelines), data binding (JSR-295), the Swing Application Framework (JSR-296), a new look-and-feel (Nimbus), etc. One area where most developers agree that Java is still lagging behind the hip / trendy frameworks (Flex, AIR, .NET 3.0, Silverlight…) is definitely video!
The way I see it, Java’s main advantage is the overall consistency and cleanliness of its APIs, which makes it an ideal academic language and encourages high quality object-oriented designs. After all, it’s not about the programming language itself… even though I generally find Java source code sexier than other languages.
Last summer, when I started working on the mini multitouch table project, I got up to speed with the state of video in Java. I realized that the Java Media Framework hasn’t been updated in ages and its APIs feel pretty old by today’s standards. On the Mac OS X platform, Apple deliberately dropped support for the QuickTime for Java bindings, which currently forces Java developers to rely on very old QuickTime interfaces that do not benefit from the major QuickTime overhaul made in Tiger (10.4). I’m referring to full access to the features of newer codecs (H.264) and high performance capture interfaces (QTKit).
A few days ago, I attended a couple of the final project (projet synthèse) presentations by students at my university. A friend of mine, François Caron, presented his implementation of an error-correcting codec for live H.264 broadcasting (such as a live TV feed or mobile videoconferencing). Let me point out that error correction is a lot more challenging in these contexts, as you don’t have access to pixels in the next frames when correcting the current frame… unless you buffer the data, which is totally unacceptable for videoconferencing. François implemented a new optional feature available for RTP packets (RFC 3984) which allows specifying the decoding order of the macroblocks (top-to-bottom order vs. checkered pattern). The benefit is that if, for instance, 50% of RTP packets are dropped (or delayed), you have a much better chance of ending up with a nicely scattered set of pixels in the buffer. This allows you to apply bilinear interpolation between the macroblocks of the checkered pattern for spatial error correction. It also increases the chances of having more good pixels from previous frames when doing temporal error correction. As I’ve always wanted to experiment more with H.264 in general, I started looking for alternatives to JMF.
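To make the checkered-pattern idea a bit more concrete, here is a minimal Java sketch. It is not François’s codec and it does not touch real H.264 or RTP data: it assumes a toy picture in which each macroblock is reduced to a single value, and all names (CheckerConcealment, checkerGroup, rasterGroup, conceal) are made up for the illustration. Lost blocks are filled with the average of their received neighbours, which is only a crude stand-in for proper bilinear interpolation.

```java
final class CheckerConcealment {
    /** Packet group (0 or 1) of block (row, col) when blocks are split in a checkered pattern. */
    static int checkerGroup(int row, int col) {
        return (row + col) % 2;
    }

    /** Packet group for plain top-to-bottom order: top half is group 0, bottom half is group 1. */
    static int rasterGroup(int row, int rows) {
        return row < rows / 2 ? 0 : 1;
    }

    /**
     * Replaces every lost block with the mean of its received 4-neighbours (in place).
     * Returns the number of lost blocks that had no received neighbour at all.
     */
    static int conceal(double[][] blocks, boolean[][] received) {
        int rows = blocks.length, cols = blocks[0].length, unrecoverable = 0;
        int[][] offsets = {{-1, 0}, {1, 0}, {0, -1}, {0, 1}};
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                if (received[r][c]) continue;
                double sum = 0;
                int count = 0;
                for (int[] o : offsets) {
                    int rr = r + o[0], cc = c + o[1];
                    boolean inside = rr >= 0 && rr < rows && cc >= 0 && cc < cols;
                    if (inside && received[rr][cc]) {
                        sum += blocks[rr][cc];
                        count++;
                    }
                }
                if (count == 0) unrecoverable++;
                else blocks[r][c] = sum / count;
            }
        }
        return unrecoverable;
    }
}
```

With the checkered split, losing one of the two groups leaves every lost block surrounded by received ones, so conceal can always fill it in. With the top-to-bottom split, losing the bottom group leaves a solid hole where only the first lost row has anything to interpolate from.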
FFmpeg, GStreamer and VLC are pretty much the de facto open source libraries for C/C++ video development. I was pleased to see that all of them have wrappers for Java: JVLC, FMJ/FFmpeg, GStreamer-Java. The level of support and quality of most of these abstractions is pretty crappy though…
Obviously there is still a lot of work to do and I hope that Sun and Apple will sort this out soon (hint: JavaOne 2008 is in two weeks). In the meantime, I’ll have to resume learning Objective-C and refresh my C++… in any case it’ll be useful for doing some more iPhone SDK hacking.
A couple of days ago, Steve ended the dreadful six-month wait that followed his announcement that Apple was going to provide a full-blown iPhone / iPod Touch SDK. Needless to say, I was among the people who caused the Apple Developer Connection servers to collapse by desperately trying to gain access to the sign-up page for the program. I was really impressed by the announcement and by the general quality of the SDK that I got to experience.
I was a bit sad, though, to see that they still haven’t ironed out the issue of providing a clean object-oriented abstraction for multitouch gestures. That has been the source of my nightmares for the past year or so.
Sometime before the iPhone came out last year, I was working on a small scale multitouch project involving the use of dot-matrix LEDs (see Jeff Han’s site to understand what I’m talking about). Unfortunately, I never got around to making it work, since it required a deep understanding of microcontrollers and low level electronic circuit timing. I initially wanted to create a cheap reconfigurable multitouch USB device by lighting up specific LEDs designating “soft” sliders and switches. In the meantime, Yamaha came out with the Tenori-On, a musical synthesizer that also uses a grid of LEDs. I’m not sure that it actually senses touch using the technique demonstrated by Han and carefully described in this Carnegie Mellon paper.
Now, I ruminated many times over this problem of abstracting multitouch gestures, especially during the time I worked on my infrared LED-based mini multitouch table project last summer. I still wasn’t satisfied with my design until I saw the GestureMatch sample code provided by Apple with the SDK. Eureka! The missing piece of the puzzle to complete the abstraction was basically there, in code form! They use cubic and quadratic Bézier curves to store the information, and they use some geometry to check whether the path closely fits the parameters of the curve. It is pretty rough and incomplete code, but seeing how it works is enough to get ideas flowing in my head on how to generalize it and add direction and timing support.
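Here is a rough Java sketch of the general idea of checking a touch path against a Bézier curve. This is not Apple’s GestureMatch code: the class and method names are made up, the control points and tolerance are assumptions, and the fit test simply samples the curve and measures how far each touch point is from the nearest sample. It ignores direction and timing entirely.

```java
final class BezierGesture {
    /** Point on the cubic Bezier curve with control points p0..p3 ({x, y} pairs) at t in [0, 1]. */
    static double[] cubic(double[] p0, double[] p1, double[] p2, double[] p3, double t) {
        double u = 1 - t;
        // Bernstein basis polynomials of degree 3.
        double b0 = u * u * u, b1 = 3 * u * u * t, b2 = 3 * u * t * t, b3 = t * t * t;
        return new double[] {
            b0 * p0[0] + b1 * p1[0] + b2 * p2[0] + b3 * p3[0],
            b0 * p0[1] + b1 * p1[1] + b2 * p2[1] + b3 * p3[1]
        };
    }

    /** Largest distance from any touch point to the nearest of (samples + 1) points on the curve. */
    static double maxDeviation(double[][] touches, double[][] ctrl, int samples) {
        double worst = 0;
        for (double[] p : touches) {
            double best = Double.MAX_VALUE;
            for (int i = 0; i <= samples; i++) {
                double[] q = cubic(ctrl[0], ctrl[1], ctrl[2], ctrl[3], (double) i / samples);
                best = Math.min(best, Math.hypot(p[0] - q[0], p[1] - q[1]));
            }
            worst = Math.max(worst, best);
        }
        return worst;
    }

    /** A path "matches" when every touch point lies within the tolerance of the sampled curve. */
    static boolean matches(double[][] touches, double[][] ctrl, double tolerance) {
        return maxDeviation(touches, ctrl, 100) <= tolerance;
    }
}
```

For example, with control points (0,0), (0,100), (100,100), (100,0) the curve is an arch whose midpoint is (50,75), so a path going through (0,0), (50,75) and (100,0) matches, while a path that cuts straight across through (50,0) does not. Adding direction support would mean also checking that the touch points hit the samples in increasing order of t.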
I briefly mentioned on my blog back in the fall that I was working on a photo editing software project, but I never got around to discussing how it works and what it does. The project is called SAPT (Smart and Automated Photomontage Tools) and was the term project that my team and I chose to do for the GTI664 (Traitement de signaux numériques) class. I led a team of four people in building this software. I was responsible for the core algorithm, based on the “Poisson Image Editing” paper.
The gist is that this software allows you to load a picture, loosely select a portion of it, load another picture, drop the selection onto it and let the algorithm do its magic… The algorithm, through some black magic that boils down to solving the linear system obtained by discretizing a Poisson differential equation, seamlessly merges the selection into the background, effectively creating a photomontage.
Now, this algorithm has its flaws, as it can’t do a great job of smoothly computing the edges if you happen to hit high-energy chunks of pixels near the borders (i.e. where the divergence is too strong). What’s nice is that we’ve designed the software architecture to be really flexible and to let us (or others) build upon the existing algorithm. In fact, when I get some spare time I wish to implement a preconditioning algorithm detailed in the “Drag and drop pasting” paper. This preconditioner runs a shortest path algorithm (Dijkstra or A*) along the edges of the selection (dropping the inner portion) to find the flow of pixels that minimizes the divergence (i.e. the Laplacian of the scalar field of pixels). I’m really proud that I was able to come up with an efficient all-Java design (a selection of 100×100 pixels generally takes less than 3 seconds to compute on a Core Duo processor). Thanks to my use of the Java 5 concurrency framework, it was a breeze to parallelize the computation.
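For the curious, here is a minimal sketch of what a Poisson blend parallelized with java.util.concurrent can look like. To be clear, this is not the SAPT code: it assumes a single-channel image, a rectangular selection whose outer ring of pixels is the boundary, and a plain Jacobi iteration as the solver (the name PoissonBlend and the parameters are made up for the illustration). It is written with lambdas, so it needs Java 8 or later, although ExecutorService itself dates back to Java 5.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class PoissonBlend {
    /**
     * Blends the interior of source into target (same size, one channel).
     * The outermost ring of target is kept fixed as the boundary condition.
     */
    static double[][] blend(double[][] source, double[][] target, int iterations, int threads) {
        final int h = target.length, w = target[0].length;
        double[][] cur = new double[h][], next = new double[h][];
        for (int y = 0; y < h; y++) {
            cur[y] = target[y].clone();
            next[y] = target[y].clone();
        }
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            for (int it = 0; it < iterations; it++) {
                final double[][] in = cur, out = next;
                List<Callable<Void>> rows = new ArrayList<>();
                for (int y = 1; y < h - 1; y++) {
                    final int row = y;
                    rows.add(() -> {
                        for (int x = 1; x < w - 1; x++) {
                            // Guidance term: sum over the 4 neighbours q of (g(p) - g(q)).
                            double guide = 4 * source[row][x] - source[row - 1][x]
                                    - source[row + 1][x] - source[row][x - 1] - source[row][x + 1];
                            // Jacobi update for 4*f(p) - sum of f(q) = guide.
                            out[row][x] = (in[row - 1][x] + in[row + 1][x]
                                    + in[row][x - 1] + in[row][x + 1] + guide) / 4;
                        }
                        return null;
                    });
                }
                pool.invokeAll(rows); // blocks until every row of this sweep is done
                double[][] tmp = cur;
                cur = next;
                next = tmp;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while blending", e);
        } finally {
            pool.shutdown();
        }
        return cur;
    }
}
```

Jacobi is a convenient choice for a sketch like this because each sweep only reads the previous buffer and writes the next one, so the rows can be handed to different threads without any locking. It converges slowly though, so a real implementation would likely use a better solver.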
You can try an early beta version, the one that our professor used for grading the project in December (you’ll need Java 5 or later). As I said, it is far from perfect, but I had a lot of fun developing it last fall.
I thought I’d share some of my favorite UNIX commands (no, not the stuff for beginners), stuff I use almost every day on OS X. Some of these commands are in POSIX, but a lot are Darwin / OS X specific (they are the other reason why I think OS X is cool, besides what Steve’s reality distortion field says).
Display ARP cache (useful to get MAC addresses of devices):
arp -a
Flush local DNS resolver cache:
lookupd -flushcache
Browse the LAN for multicast DNS (aka Bonjour / Rendezvous):
mDNS -B _daap._tcp local.
Useful alias to scan 802.11 networks with Apple AirPort:
alias ap="/System/Library/PrivateFrameworks/Apple80211.framework/Resources/airport"
Update locate database:
sudo /usr/libexec/locate.updatedb
Quick grep to search through all files under a directory hierarchy:
grep -ri "super regexp" ~/Documents/ (for instance)
Spotlight live metadata find:
mdfind -live [PATTERN]
Built-in image manipulation tool based on Quartz (replaces ImageMagick when you don’t have it):
sips RTFManpage
Pinging the broadcast address is always useful to find IPs and MACs of devices on the subnet:
ping 192.168.1.255 (for instance)
Dynamically resize partitions (great to prepare a Macintel for triple boot):
diskutil resizeVolume RTFManpage
Darwin substitute for sed, powerful text manipulation, format conversion etc… (handles RTF, HTML etc…):
textutil RTFManpage
Darwin’s super intelligent launch command:
open [somefile]
Trigger interactive screen capture:
screencapture -i
Launch Apple software update:
softwareupdate -l
Interactively manage your OS X Keychain:
security -i
Use OS X text-to-speech:
say -v "Bad News" "Mac OS X Leopard is gonna come out only in October of two thousand seven!"
Read magic bytes of any file:
file [somefile]
This is a work in progress. I’ll probably add more stuff in the next few days / weeks.
So I’m back in Montreal after 5 crazy days attending O’Reilly Where 2.0 and Google Developer Day 2007. I must say that I thoroughly enjoyed my stay in Silicon Valley. There was so much information given during the numerous talks at Where 2.0 that I cannot summarize everything, but I am going to try to give you the highlights. By the way, all the links to the stuff discussed were posted to my del.icio.us bookmarks tagged with “where2007”.
Tuesday kicked off with a talk by Schuyler Erle, whom I had met at WSFII in London two years ago. Schuyler is an author of O’Reilly’s Mapping Hacks and Google Maps Hacks, and one of the developers behind the awesome open source Google Maps replica, OpenLayers. He talked about his experience with community “remapping”, that is, when citizens create public domain maps using GPS tracks. He made great contributions to OpenStreetMap and more recently mapped the streets of Mumbai, India. Mumbai is often referred to as a “Maximum city” because of its incredibly dense population. Interestingly enough, this month’s issue of IEEE Spectrum also covers the topic of engineering mega cities. I can imagine that Schuyler’s work in Mumbai will be helpful to urban planners in this mega city, which has big space allocation problems. Schuyler also discussed his idea that maps tell stories, that is to say that maps are great to use as a base layer for displaying data (example). (Read more on O’Reilly Radar)
Next up was Topix, a website that uses some sophisticated machine learning techniques to add locative metadata to blog posts, newspaper articles etc… Their categorization engine uses TIGER/Line data, lists of city mayors, park names, bodies of water, city demonyms etc… to identify the locations, cities and places that an article is talking about and/or is from. For instance, you can choose to search for any blog posts, news stories or forum posts from or about Montreal.
Next up, John Hanke, founder of Keyhole (now Google Earth), addressed Google’s contribution of KML as an open standard spec. through the Open Geospatial Consortium in 2006. He discussed their vision of the geoweb as they are working on Google Maps and Google Earth. They launched Google Street View on that day, which I think is pretty damn cool (look below at Immersive Media for more details on the technology used). He showed a cool demo of how Google Maps Mapplets can be used to create mashups of mashups (or meta-mashups), that is, to combine many layers of data from various sources on the same Google Map (the key point here is that developers don’t need to coordinate, because they all use a common interface required by the Mapplets API).
Next up was Quakr, a website that displays geotagged Flickr photos on a 3D world map (kinda like Google Earth as a Flash app in the browser). I didn’t think it was all that interesting (especially when there are KML feeds of Flickr photos that you can use to display the data on the map you prefer…). One thing that I remember though is that the guy said that a big problem with the current use of tags / metadata is that it is often too simple to give proper semantic disambiguation, that is, to tell apart photos “taken FROM the Eiffel tower” and “OF the Eiffel tower”. Obviously it’s the fundamental idea behind all the work on Semantic Web standards (like RDF), but I still haven’t seen a fool-proof user interface to let users add this kind of metadata.
I’m skipping details on some less interesting talks…
Next up, a rep. from the EFF (Electronic Frontier Foundation) gave an interesting talk on the privacy implications of all these location-based technologies with regard to government surveillance… Poor U.S. citizens :-p He showed a nice slide with a modified AT&T logo with their new slogan “AT&T. Your world. Delivered to the NSA”.
I was glad to hear Christopher Schmidt from MetaCarta talk about OpenLayers. OpenLayers is an awesome open source project that aims to create a vendor neutral abstraction over all map APIs (Google Maps, Yahoo Maps, MS Virtual Earth) and all open standards feeds (KML, WMS, GeoRSS…). I have been following this project for over a year, so I was glad to actually talk to some of the developers.
On Tuesday night Sonya and I set up our booth for the Where Fair, which was in the lobby next to the main conference room. O’Reilly had printed a very nice poster with the description of iFIND. We talked to so many people that night, it was awesome. I really enjoyed explaining what iFIND does and how it works to some amazing people. Everybody who came to our booth seemed genuinely interested. Among others, I showed it to people from Nokia Research, Intel Research, Google, uLocate, Volkswagen Electronics Research Lab and MetaCarta. I had prepared a modified version of the iFIND client to simulate how it works on campus. I had also prepared a screencast video so that Sonya could demo the software to other people on her computer.
The Where Fair lasted for three hours, then we headed to the bar where Skyhook was organizing their “annual beer bash”, basically free beer for everybody :-). We mingled with a lot of people and had a great time (though the party ended pretty early… something like 1am… well there was still one day left). We spent most of our time with the cool guys of Poly9 from Quebec City.
Highlights from day two:
Giving more details on the technology behind Google Street View, Toronto-based Immersive Media showed their 11-camera capture device. They actually use standard high-def. video sensors and capture at 30 fps, then their algorithm stitches the frames together. They offer a 3U type server that captures the massive amount of data and adds GPS metadata. The guy said that the latest version of their hardware is able to give 1-inch resolution at 50 feet!
Another exciting moment was when Google Earth’s CTO demoed his Apple iPhone with Google Maps! Unfortunately, we have since learned that the iPhone does NOT contain a GPS unit, nor will it have access to a GSM-based triangulated position…
I have to say that I was pretty impressed with the venue where the conference was held, the professionalism of the crew, the attention to detail and the quality of the speakers. Sonya and I did some good PR work for the MIT SENSEable City Lab, as we met so many people and told them about iFIND and the kind of research the lab is doing.
So I ended up staying one more day to attend Google Developer Day at the San Jose Convention Center (one block from the O’Reilly Where 2.0 venue). It was a geek paradise! Everything was free (food, beer, conference, Google schwag etc…). Again, I mingled with tons of cool geeks… actually I ended up running into a lot of the Where 2.0 attendees I had met. They launched Google Gears that morning, which I was pretty excited about. I attended talks about Google Gears, the new additions to the Google Maps API and the Google system architecture. The day ended with a big party at the Google Campus. I was really impressed with the location. It sure looks like an amazing place to work! I even had the chance to talk to Romain Guy, the guy who worked on Sun’s SwingX and SwingX-WS, that is, the libraries I used to develop iFIND!
In a nutshell, this was an amazing trip to a geek paradise. I have to thank Brady Forrest, O’Reilly Where 2.0 organizer, for inviting me to present iFIND and Carlo Ratti, director of the MIT SENSEable City Lab, for letting me actually go there!
Check out this Windows 2.0 ad from 1986 in which Steve Ballmer, Microsoft’s current CEO, dressed like a cheap vacuum cleaner salesman, talks up the magnificent features of his product. Apart from the jacket, he still has the same look today…
I got this brand new book written by Cal Henderson, Flickr’s lead engineer. Cal has a great tongue-in-cheek writing style, which makes reading in-depth discussions of Web application architecture a pleasure.
The book mostly concentrates on architecture, so it’s really not a programming book. It covers hot topics such as version control systems, i18n, Web app security (XSS, SQL injection…), e-mail services, APIs… There is a nice chapter covering “Scaling Web applications”: load balancing, replication, caching etc… As this comes from one of Flickr’s clever minds, it has a Web 2.0 feeling (APIs, lightweight architectures…), which I like a lot.
Over the last year, working for Flickr really became my dream job … still got one year before I get the ring !
OK, I’m seriously fed up with writing about the new toy every other day. But this time it’s so wicked and unusual that I really had to mention it. Enter Google Web Toolkit. Their tagline states: “Build AJAX apps in the Java language”. Huh?! Yes, indeed, that is totally weird. They say you can develop and debug as if you were writing a desktop app… But I have a very weird feeling about this thing, it feels so bizarre. Have a look at their demos and “Getting started” guides…
Sun Microsystems has announced, as the rumors had suggested, its plan to open the Java source code.
“We will open source Java, we just need to figure out how,” said Green, referring to his desire to foster Java specifications and reference platforms compatibility along with the openness. [1]
This statement is very interesting, and one can assume that Sun plans to use the “cathedral” model (cf. Eric Raymond’s “The Cathedral and the Bazaar”). On the same day of the annual JavaOne conference, Simon Phipps stated that there was no longer any restriction on including Sun’s Java Virtual Machine in GNU/Linux distributions.
In this article, which made me smile because there is most certainly some truth in it, the author explains that too many companies of the Web 2.0 “new bubble” are targeting a niche market: “Too many companies are targeting an audience of 53,651. That’s how many people subscribe to Michael Arrington’s TechCrunch blog feed.” In “The Myth, Reality & Future of Web 2.0”, the author explains that alongside the true stars of Web 2.0 (Flickr, Digg, YouTube…), what we hear about far too often are large-scale public betas. Many of these sites don’t even have a medium-term vision, let alone a long-term one. You only need to scroll through the list of “SEOmoz’s Web 2.0 Awards” to be convinced. How many of these sites will still be around in 6 months, in 12 months? On the other hand, all this effervescence has its good side and, in my humble opinion, does not pose a real danger to the stock markets, since they got a cold shower not so long ago and these companies are generally not looking for very large amounts of capital. In less than 18 months, the buzzword AJAX was born, Google Maps transformed the way we interact with maps, it has become perfectly normal to be notified of your friends’ new Flickr photos through an RSS aggregator, and I am subscribed to more than 20 podcasts and vidcasts. This is an avalanche, a wild bubbling of ideas about new technologies. The curve is accelerating and, more and more, the long tail makes it easier to get products that used to be hard to find.
In short, if you are one of the 53,650 others, you will understand that this is only the beginning. We sort through all of this every day, and the cream of the crop will rise to the top.