Geekin.gs

geek writings 

Using Amazon AWS SimpleDB with Ruby roundup

On my old blog I published a couple of articles about Amazon SimpleDB and the tools for using this distributed key-value store from Ruby.

Since the resources in those articles and on the slides of my SimpleDB presentation are getting outdated or incomplete, I thought I'd spend some time putting together a better collection of pointers for people who are getting started with Ruby and SimpleDB.

So here's the list of projects that a SimpleDB beginner should look at:

  • Right AWS gem: a gem that provides an API to interact with many of the Amazon Web Services, including a specific solution for SimpleDB.
  • Helene is another Ruby gem which provides an API for multiple Amazon Web Services, SimpleDB among them. Of special interest here is that Helene supports automatic data type conversion (SimpleDB internally stores only strings) and a few other facilities, like relationships (has_many etc.) and validations, in a fashion very similar to what ActiveRecord does for SQL backends.
  • Raws is again a multi-purpose gem, implementing a really straightforward hash-like API to interact with SimpleDB. Another point of interest is that it is the only one which doesn't use net/http to manage connections to the Amazon Web Services, choosing Typhoeus instead.
  • aws_sdb_bare is a more basic gem which provides classes to build requests and parse responses from SimpleDB. The gem itself doesn't implement any HTTP communication strategy, leaving this open to the developer.

Obsolete/broken gems

  • aws-sdb is no longer maintained and implements a signature version which is deprecated. On top of that, the last version of the gem's source code was lost when the author deleted the original repository on GitHub.
  • dead_simple_db is no longer maintained. It is a gem built on top of aws-sdb that provides basic automatic type conversion and ActiveRecord-like features to map objects to SimpleDB entries.

Filed under  //   amazon   aws   ruby   sdb   simpledb  


Install Google V8 on Ubuntu Hardy 64-bit

This one required more googling than expected, so here's the transcript. Please note this is for 64-bit systems.

sudo apt-get install build-essential lib32stdc++6 scons
Python is a prerequisite as well, but it should already be on your system. In theory the process should be as simple as:
mkdir v8
svn checkout http://v8.googlecode.com/svn/trunk v8
cd v8
scons
But I was compiling with "scons snapshot=on", and while doing so I hit two error messages. The first was:
cannot find -lstdc++
which was solved by doing:
sudo ln -s /usr/lib32/libstdc++.so.6 /usr/lib32/libstdc++.so
After that I hit the error:
/usr/bin/ld: cannot find -lgcc_s
which was solved by:
sudo apt-get install gcc-multilib

Filed under  //   google   javascript   scons   ubuntu   v8  


Why and how CouchDB tickles my mind

The recent Erlang Factory in London gave us yet another wave of buzz about two software projects that have held my attention for the last few months: RabbitMQ and CouchDB.

In this post I want to concentrate on CouchDB. To be more specific, I'm going to explain why CouchDB intrigues me: it changes the usual application stack and deployment.

The usual deployment stack

To walk through what a current web application stack looks like, I'll pick a Ruby on Rails app as a sample, because I'm used to it and because, for what I'm going to look at in this post, it is similar enough to applications built with other technologies.

1 - SQL database

This is the most common choice of data storage for web applications. Whether it's MySQL, PostgreSQL or another, the process you go through to get started is really similar:

  • structure your data in tables with columns
  • define a datatype for every column, and whether or not it can be null
  • define and enforce relationships among tables

Rails makes this easy: through migrations you explicitly define all of the above, and Rails sets it all up in the SQL backend of your choice.

2 - Rails application

In the stack, the application sits on top of the SQL database and interacts directly with it.

To make this interaction happen there are actually two libraries that come into play: a low-level adapter that interfaces Ruby with the SQL server and, on top of that, ActiveRecord.

If you took a look at the code of these libraries, you'd notice that there's a lot going on just to mirror, in your application logic, the information about datatypes and structure that you already defined in your database.

While most of this mirroring happens behind the scenes, some of it requires bits of configuration here and there, only to get back something you have already defined.

ActiveRecord lets you express a lot more than what can be described in the database, but for our purposes the interesting part is that there's a significant amount of code whose job is to ease the friction at the interface between our database and the application.

This friction is mostly due to the fact that application and database are completely alien to each other, and neither has any real understanding of the other end.

3 - The web server

This acts mostly as a vector to move data from the application to the final user's browser.

4 - The browser

Is this guy part of the stack? It is: a number of applications have significant portions of their logic living here. So the browser can provide relevant functionality, but once again it's a different world, and in some simple cases all the Rails application has to do is fetch data from the SQL database, convert it into Ruby objects, and render HTML or JSON to send this data to the browser.

The CouchDB stack

1 - CouchDB

CouchDB's data storage model is really straightforward: there are databases, and a database contains a collection of documents. Documents are JavaScript objects, or at least that's how you deal with them from a developer's point of view.

Documents are not restricted to any schema, but implicitly every document defines its own structure and datatypes.

At the interface, CouchDB uses JSON and HTTP to interact with the other levels of the stack. Because it uses JSON, the data coming out of CouchDB reflects, at every level of detail, the data that is stored in the database.

Alternatively, it is possible to generate output in other formats, like HTML or XML, directly from CouchDB.

2 - Browser

Since CouchDB exposes itself via HTTP and is able either to render HTML or to serve data as JSON, in some cases we're able to squash the stack from 4 levels to 2.

If we serve JSON from CouchDB, we get for free the fact that the data, as it comes out of the database, is already in a form the browser can understand and manipulate directly.
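As a rough illustration of that point, the snippet below parses a JSON document of the kind CouchDB returns and uses it as a plain JavaScript object. The document contents, the field names and the `describe` helper are made up for this sketch; only `_id` and `_rev` are fields that CouchDB itself adds to every document.

```javascript
// A response body as CouchDB might return it for a single document.
// Everything except _id and _rev is a hypothetical field for this sketch.
var responseBody =
  '{"_id":"alice","_rev":"1-abc","name":"Alice","tags":["ruby","couchdb"],"age":30}';

// One call turns the stored document into a native object.
var doc = JSON.parse(responseBody);

// Nested values keep their types: arrays stay arrays, numbers stay numbers,
// so no adapter or mapping layer is needed before using them.
function describe(doc) {
  return doc.name + " (" + doc.age + ") likes " + doc.tags.join(" and ");
}
```

There is no schema to mirror on the client side here: whatever structure the document has in the database is the structure the browser code works with.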

The definition of "distributed, fault-tolerant and schema-free document-oriented database" doesn't say much about the fact that CouchDB can actually be used as a stand alone solution to publish small web services.

As a rule of thumb for people familiar with Ruby and Sinatra: if you can think of an application that you could write using a SQL DB and a Sinatra app of a few hundred lines, you can probably write the same app as a couchapp and host it completely on CouchDB.

Some more detail

The 2-level stack is not always achievable, or advisable, since CouchDB is still meant to be a database and not a self-contained application hosting platform (and IMHO this is a good thing, since it gives CouchDB the opportunity to keep its code base super tight and simple).

But being able to simplify the application stack in some areas is just great.

Let's see an example: you have an application that stores user profiles, and users can choose which fields of their profile are public.

You can store in a CouchDB document the complete profile data including the list of fields that are meant to be public.

At this point you define a show function that renders the profile including only the public fields.

This way you get from CouchDB a URL for each user that publishes the public profile, and this URL can be accessed directly as it is - here you go - 2-level stack.
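The show function described above could be sketched roughly as follows. The document shape, the `public_fields` key and the function names are assumptions made for this example; the parts taken from CouchDB are the `(doc, req)` signature and the returned object with `headers` and `body`.

```javascript
// Hypothetical profile document: "public_fields" lists the keys the user
// has chosen to expose. Nothing about this shape is imposed by CouchDB.
var exampleProfile = {
  _id: "alice",
  name: "Alice",
  email: "alice@example.com",
  city: "London",
  public_fields: ["name", "city"]
};

// Copy only the keys listed in doc.public_fields into a new object.
function publicProfile(doc) {
  var visible = {};
  (doc.public_fields || []).forEach(function (field) {
    if (Object.prototype.hasOwnProperty.call(doc, field)) {
      visible[field] = doc[field];
    }
  });
  return visible;
}

// Show function: receives the stored document and the request,
// and returns the response to send back to the client.
function showPublicProfile(doc, req) {
  return {
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(publicProfile(doc))
  };
}
```

In a real design document the function is stored as a string under the `shows` key, so the helper would have to be inlined into it or loaded as a module. With a database called `profiles` and a design document called `app` (both names invented here), the public profile would then live at a URL of the form /profiles/_design/app/_show/public/alice.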

If you're thinking of a way to handle private profiles, you can still render them from CouchDB: all you need is a thin layer of software sitting between the browser and CouchDB that decides whether or not a user can access the URL of the private profile. Tricks like the X-Accel-Redirect header and Rack middleware can play a big and intriguing part in building loosely coupled, scalable applications.
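The decision that such a thin layer has to make can be sketched as a single function. It is written in JavaScript here only to match the other examples; in practice it would live in a Rack middleware or something similar. The `/couch-internal/` prefix, the function name and the "only the owner may see the profile" rule are all assumptions of this sketch. It also assumes nginx is configured with an internal location under that prefix which proxies to CouchDB.

```javascript
// Hypothetical prefix of an nginx location marked "internal" that proxies
// to the CouchDB URL rendering the private profile.
var INTERNAL_PREFIX = "/couch-internal/";

// Decide how to answer a request for a private profile.
// currentUser: id of the logged-in user, or null when nobody is logged in.
// profileId:   id of the profile being requested.
function privateProfileResponse(currentUser, profileId) {
  if (!currentUser) {
    return { status: 401, headers: {}, body: "Login required" };
  }
  if (currentUser !== profileId) {
    return { status: 403, headers: {}, body: "Forbidden" };
  }
  // No body is sent from here: nginx sees the header and serves the
  // internal location itself, so the profile still comes out of CouchDB.
  return {
    status: 200,
    headers: { "X-Accel-Redirect": INTERNAL_PREFIX + encodeURIComponent(profileId) },
    body: ""
  };
}
```

The layer never touches the profile data: it only answers yes or no, and on a yes it hands the request back to nginx, which is what keeps the two pieces loosely coupled.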

Last but not least comes the fact that CouchDB uses JavaScript as its internal language to manipulate the data. This opens up the possibility of using exactly the same libraries in CouchDB and in the browser.

Add on top of all this the replication capabilities of CouchDB and you've got a lot to think about.

Conclusion

I'm not sure whether I'll be working on apps using CouchDB in the next few years, but my guess is that I'll be working on applications which include a lot of the ideas CouchDB is currently pushing.

Filed under  //   couchdb   javascript   nginx  
