Posted by
jamie
on Wed Aug 10, '05 09:48 AM from the roundup dept.
OSCON 2005
was held in a convention center this year, instead of a hotel, because it just
got too big (2000+ people). Too big, in fact, for pudge and myself to cover
more than a fraction of the talks and the ideas flitting around the hallways.
But here's some of what I found cool last week. And if you attended or
presented at OSCON and want to tell us about all the neat stuff we missed,
please, share your thoughts in the comments, or
submit
a fact-rich writeup and we'll maybe do a followup story later.
Mike Shaver's talk on writing
Firefox extensions
was packed to the walls. If you've been wanting to try it, Firefox 1.5
makes development easier, and should be out soon, so now's a good time.
This talk and the tutorial on
Ajax
persuaded me to start using the DOM Inspector and
debugging some JavaScript
to get a better understanding of webpage manipulation.
Aaron Boodman's talk on his extension
Greasemonkey
was a walkthrough of writing a simple GM user script, a discussion of what's
coming up, and some Q&A. Greasemonkey 0.5 ("Now With Security!") is in
beta: there are multiple security changes that suggest someone really has sat
down and thought the whole model through. GM works with Firefox, Seamonkey,
Opera, and Windows MSIE (but not, oh please somebody correct this oversight,
Safari).
Ruby on Rails
is hot; if you want to develop a web app quickly you can't ignore it. It
stresses "convention over configuration" with reasonable defaults. The
tutorial went from installation to the "hello world" of the web, a blog (!),
in a few hours. Anyone have a real-world example of Rails scaling to a large
project and lots of traffic?
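To make "convention over configuration" concrete, here is a minimal sketch in Python (not Ruby, and not how Rails itself is implemented). The `Model` base class, the `table_name_for` helper and the naive add-an-"s" pluralization are all assumptions made up for illustration: the table name is derived from the class name unless a class explicitly overrides it.

```python
import re


def table_name_for(class_name):
    """Convention: CamelCase class name -> snake_case, naively pluralized."""
    snake = re.sub(r"(?<!^)(?=[A-Z])", "_", class_name).lower()
    return snake if snake.endswith("s") else snake + "s"


class Model:
    # Explicit configuration; only needed when the convention does not fit.
    table_name = None

    @classmethod
    def table(cls):
        return cls.table_name or table_name_for(cls.__name__)


class Post(Model):
    pass  # no configuration: the table name follows from the class name


class BlogComment(Model):
    pass


class Person(Model):
    table_name = "people"  # irregular plural, so we configure it by hand


print(Post.table())         # posts
print(BlogComment.table())  # blog_comments
print(Person.table())       # people
```

The point of the design is that the common case needs no code at all, and configuration shows up only where a class departs from the default.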
DarwinBuild is
an open-source project from Apple that aids in building the open-source
components of Darwin/Mac OS X. Given a build number of Mac OS X, it will fetch
and build the software for that version, allowing you to modify the source
as needed, making it easy for any developer to modify everything from the
kernel to various utilities (just remember to reapply the modifications after
running Software Update, if necessary). Besides the web site, you can read
more about it in the
presentation
slides.
Google and O'Reilly gave out the
2005 open source awards,
with $5000 attached to each. Congratulations to the winners.
Tony Baxter's
Shtoom
is a cross-platform VoIP client and software framework, written in Python, for
writing your own phone applications.
Novell is still moving its employees from Windows to Linux, which
we first heard at last year's OSCON. The migration from Microsoft Office
to OpenOffice is complete, and the big step, from Windows to Linux, is 50%
complete, projected to be 80% by November. Miguel de Icaza gave flashy demos
of some Linux desktop applications that didn't impress this cynical observer
very much.
PlaceSite
is an open-source project looking to bring physical proximity awareness to
Internet access at coffeeshops and other meeting places: think "local-only
Friendster" and you're not far off. They got feedback
from a monthlong trial earlier this year and are working on a new version
that will be easy to deploy. Could be neat.
In a great 2-hour session on Wednesday, we got to hear from
representatives of four leading open source databases about what they've been
working on lately. Here are the summaries...
Ingres r3 has
an impressive list of big features. Ingres was just open-sourced by Computer
Associates this summer, and it's gotten a lot of attention for being a
full-featured enterprise database. Ingres supports table partitioning that can
be either range-based or hash-based, which can greatly improve performance in
many cases. Its optimizer can now come up with parallel execution plans, which
can be useful even on single-CPU machines and non-partitioned tables. There's
also federated data storage (one can access data stored in another RDBMS
through Ingres) and replication. And they're working on a concurrent access
cluster, to allow data to be manipulated not just by multiple threads on one
machine, but multiple machines.
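For readers who haven't met the two partitioning styles, here is a rough Python sketch of how a row's key might be routed to a partition under each one. It illustrates the general idea only, not Ingres's implementation; the function names, the boundary values and the choice of CRC32 as the hash are assumptions.

```python
import bisect
import zlib


def range_partition(key, upper_bounds):
    """Range-based: partition i holds keys below upper_bounds[i].

    Keys at or above the last bound land in one extra overflow partition.
    """
    return bisect.bisect_right(upper_bounds, key)


def hash_partition(key, n_partitions):
    """Hash-based: spread keys evenly, with no regard to their order."""
    return zlib.crc32(key.encode("utf-8")) % n_partitions


bounds = [100, 200, 300]            # hypothetical boundaries on an order id
print(range_partition(50, bounds))   # 0
print(range_partition(100, bounds))  # 1
print(range_partition(999, bounds))  # 3 (overflow partition)
print(hash_partition("customer-42", 4))  # some value in 0..3
```

Range partitioning keeps neighboring keys together, which helps queries over a span of values; hash partitioning trades that away for an even spread of rows.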
A side note: Computer Associates was invited by O'Reilly to talk about
its recent open-sourcing of Ingres. Its representative, while confessing that
introducing a new license was "probably the wrong thing to do," said that
other licenses wouldn't have worked for them (the GPL "was seen as viral").
The one question that the audience had time to ask was "is Ingres a dump" --
is CA making it open-source to transfer the responsibility of support from the
company to the community? The three-part "no" answer was that there are more
CA developers working on Ingres now, that Ingres is at the core of their new
releases, and that they've sponsored a "million-dollar challenge" to foster
community interest. Time will tell, I guess.
Firebird 2.0
has been in alpha since January and a beta is expected soon. Since 2000 much
of their development has been aimed at making the product easy to install, and
making the code easy for a distributed group of developers to work on. This
year they're building features on that groundwork. Their design includes
2-phase commits (since the beginning), cooperative garbage collection (as a
transaction encounters unneeded data, it removes it) and self-balancing
indexes. Backup has been improved. When 2.0 gets to beta, I'm going to check
this out; it sounds like very interesting technology (and apparently it will
install with four clicks!).
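The cooperative garbage collection idea can be sketched in a few lines of Python. This is a toy model, not Firebird's code: the `VersionedStore` class is hypothetical, every write is assumed to be committed, and transaction ids are assumed to increase over time. The point is only that the reader itself discards versions no active transaction can still see, rather than leaving them for a separate sweep.

```python
class VersionedStore:
    def __init__(self):
        # key -> list of (writer_txn_id, value), oldest version first
        self.versions = {}

    def write(self, key, txn_id, value):
        self.versions.setdefault(key, []).append((txn_id, value))

    def read(self, key, txn_id, oldest_active_txn):
        chain = self.versions.get(key, [])
        # Cooperative step: find the newest version the oldest active
        # transaction can see; everything before it is garbage.
        keep_from = 0
        for i, (writer, _) in enumerate(chain):
            if writer <= oldest_active_txn:
                keep_from = i
        del chain[:keep_from]
        # Normal read: newest version written at or before this transaction.
        visible = [value for writer, value in chain if writer <= txn_id]
        return visible[-1] if visible else None


store = VersionedStore()
store.write("row", 1, "a")
store.write("row", 2, "b")
store.write("row", 5, "c")
print(store.read("row", txn_id=3, oldest_active_txn=3))  # b
print(len(store.versions["row"]))                        # 2, version "a" is gone
```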
MySQL 5.0 is in beta, and has been
feature-frozen since April. Back in 4.1, its abstracted table-type was
put to advantage with odd engines like Archive (only insert, no update);
Blackhole for fast replication; and an improvement to MyISAM for logging
(allowing concurrent selects with inserts-at-table-end). Their Connector/MXJ
lets you run a native MySQL server embedded inside a Java application. In 5.0
we're seeing stored procedures per the SQL:2003 standard, triggers, updatable
views, XA (distributed transactions), SAP R/3 compatible server side cursors,
fast precision math, a federated storage engine, a greedy optimizer for better
handling of many-table joins, and an optional "strict mode" to turn some of
MySQL's friendly nonstandard warnings into compliant errors. And they're
working on partitioning, ODBC, and letting MySQL Cluster's non-indexed columns
be stored on disk.
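Two-phase commit, the protocol behind XA-style distributed transactions, is easy to sketch. The Python below is a bare-bones illustration, not any database's API: the `Participant` class and `two_phase_commit` function are hypothetical names, and a real implementation also needs durable logging and recovery after a crash.

```python
class Participant:
    """One resource taking part in the transaction (a database, a queue...)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self):
        # Phase 1: promise to commit if asked, or vote no.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"


def two_phase_commit(participants):
    # Phase 1: collect a vote from every participant.
    votes = [p.prepare() for p in participants]
    # Phase 2: commit only if everyone voted yes, otherwise roll back all.
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False


db, queue = Participant("database"), Participant("message queue")
print(two_phase_commit([db, queue]))  # True
print(db.state, queue.state)          # committed committed
```

The guarantee is all-or-nothing: if any one participant cannot prepare, nobody commits.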
PostgreSQL 8.1 is expected to
be released in November or December, after a feature-freeze in July -- and
it has an impressive list of new features. Their optimizer will make use of
multiple indexes when appropriate, which is pretty darn exciting. The
recommendation will be that in most cases it will be most efficient to have
only single-column indexes and let the optimizer figure out which combination
to use. They're implementing a 2-phase commit, they're bringing the automatic
vacuum into the core code, and they removed a global shared buffer lock so
they're now getting "almost linear" SMP performance scaling. I've never felt
the need for Postgres, but I'm definitely going to look at 8.1.
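Here is a small Python sketch of why several single-column indexes can stand in for one multi-column index. It shows the general idea only, not PostgreSQL's implementation: each index maps a value to the set of matching row ids, and a two-condition query intersects the two sets. The table contents and function names are made up for the example.

```python
from collections import defaultdict

# Hypothetical table: one dict per row.
rows = [
    {"id": 0, "city": "Portland", "year": 2005},
    {"id": 1, "city": "Portland", "year": 2004},
    {"id": 2, "city": "Seattle", "year": 2005},
    {"id": 3, "city": "Portland", "year": 2005},
]


def build_index(table, column):
    """Single-column index: value -> set of row ids holding that value."""
    index = defaultdict(set)
    for row in table:
        index[row[column]].add(row["id"])
    return index


def lookup_and(index_a, value_a, index_b, value_b):
    """Combine two single-column indexes for 'a = ? AND b = ?'."""
    return index_a.get(value_a, set()) & index_b.get(value_b, set())


city_index = build_index(rows, "city")
year_index = build_index(rows, "year")
print(sorted(lookup_and(city_index, "Portland", year_index, 2005)))  # [0, 3]
```

The same two indexes can also serve queries on city alone or year alone, which a single combined index handles less well.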
He looks awfully cheery for having no body and a set of crap headphones, doesn't he?
I know I'll get modded down for this, but on topic: that DarwinBuild stuff looks pretty handy for, say, upgrades in time without having to wait for Apple to stream them in via OS updates like they did with Server X.3.X. Also, I hope the next step is to allow X apps to run outside of the X11 environment, at least semi-natively. I don't really like the current solution of having to have 23489 apps running inside X11.app when you can inadvertently kill them all off with one errant Fruit+Q.
Anyone have a real-world example of Rails scaling to a large project and lots of traffic?
While there are no sites I know of with massive traffic that run Rails, there are a few large projects. TextDrive [textdrive.com] runs StrongSpace [strongspace.com], which is basically online storage using SFTP and RoR. There are also a few from the creators of RoR: BaseCamp [basecamphq.com], BackPack [backpackit.com]...
There is no downside to using ruby on rails. Closest thing to a silver bullet since the web came out.
There are several significant downsides to using Ruby on Rails.
Firstly, the way that ActiveRecord works by default - generating classes at run-time based on database tables - is considered by many (well, me at least!) to be a very backward step, as it makes code vulnerable to changes in those tables, and also makes portability of code between different databases non-trivial. There are far better ways to do this - the Python ORM Dejavu (in which the data model is expressed as classes) is an example. Almost all modern development languages work this way - with the exception of RoR!
Secondly, Ruby is slow. There may be future JIT systems that help deal with this, but they are not there yet.
Thirdly, Ruby is changing, and it is likely (from what I read) that the next version will not be fully compatible, so any major project developed now in Rails will have upgrading issues.
So Ruby on Rails is very far from a silver bullet. It may be a neat way to get small (in terms of code) websites up quickly.
I think that is good. Desktops are supposed to be boring (at least for business). Too much eye candy or things to be impressed by (3D flipping transparent rotating windows, everything animated, multimedia under every mouse click) has nothing to do with productivity or doing business anymore. I think Novell realises this: they now know that they can run their business on desktop Linux (and they do), and that it does not really have to impress anybody. The message from Novell is: if somebody wants to save on licences and maintenance at the next migration, just look over here.
MySQL getting XA is huge. In the retail world, a lot of companies are switching to J2EE-based POS applications. This requires a database in each store. The problem is that the J2EE servers need an XA-enabled database so that the JMS reads/writes can occur within the same transaction as the data being generated. This has historically ruled MySQL out, which would otherwise be the natural choice. I'm glad to hear XA will be supported in the next release as this opens up MySQL to a whole new audience.
We, MySQL AB, license by host, not client. I am not sure where you got your information but it is incorrect. Can you tell me where you got your information? I would like to make sure we correct whatever the source of it was.
I have to disagree with MySQL being the natural choice. With both MySQL and the drivers being GPL, a vendor of POS applications would either have to GPL his POS application, or pay for a commercial license for each sold unit. Neither appears to be particularly attractive.
XA support has been committed into the upcoming PostgreSQL release and is already supported by Firebird. Considering their licensing, both are better choices.
From the main site: "Rails is a full-stack, open-source web framework in Ruby for writing real-world applications with joy and less code than most frameworks spend doing XML sit-ups".... Curious: so how does this compete with other web frameworks in use (LAMP, J2EE, .Net, etc)? Pros, cons? Any users/developers here?
"...maybe do a followup story later. " Ahh! So when things are reposted several times, they're actually followup stories. Sorry to be so critical, Taco.
I have a buddy who works in the Novell legal department. I asked him if he was being forced to switch from MSOffice to OpenOffice, and he said no. There is no way he could prepare his necessary documents with OO because of some features it lacked. Specifically he said they had problems with generating the kinds of tables they needed.
Further, he indicated that they were not going to be forced to switch. I wonder if that 100% change that Miguel indicated was for the technical and support staff only.
Anyhow, I decided to download and try out NLD when I got back from the conference. It failed to recognize my monitor (19" Dell flat panel with DVI interface) and sound didn't work even though it recognized the card. On recommendation from a friend, I tried Ubuntu the other night and it worked with everything (except the printer, which needs some driver change that I haven't done yet).
"Ingres was just open-sourced by Computer Associates this summer". Well, yes and no....
Ingres was the very first ever relational database. It pre-dated the widespread use of SQL. It was released under a BSD-style license and was Open Source before the term "Open Source" was in wide use. Later Ingres was further developed commercially and sold as a product, and the old Open Source version became known as "University Ingres". It is the later commercial version that is now open-sourced.
PostgreSQL (aka Postgres) comes from the same university developers as Ingres. It was their next DBMS, hence the name: post-gres, for after-Ingres. Postgres has a lineage dating back to the first RDBMS. Postgres too was started before SQL was universal, but was converted over to accept SQL in 1995 and then later renamed "PostgreSQL" (with the "QL" being silent).
Bottom line: Ingres was the grandfather of open source DBMSes and has several important children.
One of the highlights for me was the talk by Dries Buytaert, founder of Drupal, on Thursday.
Drupal is way ahead of Ruby on Rails in terms of flexibility, scalability and implementation, IMO. They work in different spaces (Ruby hosting is scarce, though there are a few) but the clean architecture and extensibility of Drupal while remaining fast and small is exciting.
Although I applaud Ruby at finally getting an MVC-based framework together, I don't see what all the hype is lately. MVC-based frameworks have been around for a long time, and MVC-based frameworks in other OSS languages have been around since RoR came to be as well.
Java has Struts and others, Perl has an excellent Framework called Catalyst based on another MVC framework by Simon Cozens called Maypole - see Jesse Sheidlower's article [perl.com] on O'Reilly for building an AJAX-based framework in 30-lines of code or less in Catalyst. PHP even has one that's been out for awhile called Fusebox that is based off of another for Cold Fusion. What is it that is so special about Ruby on Rails?
So, I'll preface this by saying that "cost saving" measures at my day job kept me from registering for the full conference. However, I did get in to the keynotes on Thursday. I went then specifically so I could hear Robert Lang talk about Computational Origami. Really cool topic. I just wish I could have had him sign one of his books for me :P.
3 favorite schwag items: black rubber duckies from BlackDuck, 64 MB USB thumb drive from Intel, and stuffed Dicey
While PostgreSQL usually doesn't seem to get decent press from O'Reilly, it practically dominated the database talks at OSCON. Just about every database event either included or featured PostgreSQL.
The things in the works for 8.1 and 8.2 are looking very interesting indeed. Besides 2-phase commit and bitmap indexes:
- Full multi-master replication with Slony 2 (Slony 1 is single-master)
- IN/OUT parameter declaration for more flexibility in functions (making PL/pgSQL even more like PL/SQL)
- Much more usable and flexible custom datatype creation. "Complex" datatypes may not seem too useful for standard business databases, but they have all sorts of applications in areas like mapping, engineering, scientific analysis, etc... (Not to mention which, you might even want to think in terms of mapping custom datatypes to classes in your applications.)
- Horizontal partitioning: this is a concept used in very large tables, where you might want different groups of rows from one table stored in different locations, such as different physical disks, SANs, etc... This sort of thing is actually already being done informally by leveraging PostgreSQL's table inheritance features and tablespaces, but in the future (8.2 or 8.3?) it should become a standard feature.
Speaking of which, how many people caught Josh Berkus and Joe Conway's talk on "Terabytes of Business Intelligence"? Very interesting insights on how to handle very large databases in PostgreSQL. For example, Joe Conway's project gathers statistics from industrial equipment around the world, receiving several GB a day of data on the central server, an 8-CPU XEON, storing the data on a SAN array via NFS mount configured for jumbo frames. To handle the super-large main table, he created a partitioning schema with 12 sub-tables, each holding the data for one month of the year, and then tied them to a main table via inheritance [postgresql.org] to present a unified relation of all this data. In some ways this may sound like a view from a UNION query, but the implementation has much better performance and maintenance implications.
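For anyone wanting to try the month-per-table layout, here is a small Python sketch that generates the DDL. The table name `readings` and column `recorded_at` are made up for illustration and are not from the talk. In PostgreSQL the sub-tables are the ones that declare `INHERITS`, and a query against the parent then includes rows from every child.

```python
def monthly_partition_ddl(parent="readings", column="recorded_at"):
    """Build one CREATE TABLE statement per month, each inheriting from parent.

    The CHECK constraint documents (and enforces) which month a child holds.
    """
    statements = []
    for month in range(1, 13):
        statements.append(
            f"CREATE TABLE {parent}_m{month:02d} ("
            f"CHECK (EXTRACT(MONTH FROM {column}) = {month})"
            f") INHERITS ({parent});"
        )
    return statements


for statement in monthly_partition_ddl():
    print(statement)
```

The application (or a rule or trigger) still has to send each new row to the right sub-table; this sketch covers only the table definitions.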
All this means PostgreSQL is steadily gaining ground on the Big 3 database vendors, and in some ways surpassing them, as far as the quality of the implementation. Many of the distinctions now are in external areas, such as application servers and federated systems. (You can do either of these things in PostgreSQL, but there is no official standardized method)
We all know that the main DB vendors don't allow anyone to publish benchmarks without permission, so of course there are no easily accessible benchmarks between PostgreSQL and Oracle, for instance. But, in informal talks at OSCON, I found at least a couple companies who had done their own internal benchmarks and PostgreSQL came out ahead surprisingly often.