20 November 2008

Changing the feed links!

I updated the feed address to http://feedproxy.google.com/bigcurl.
Redirects from the old URL are in place, but it is better to update your subscription right away.

19 November 2008

Amazon S3: Save money by setting the cache header appropriately

Maybe I'm late to the party, but I just discovered that you can set the Cache-Control header in Amazon S3.

What does setting the cache control header mean?

Straight from the RFC (RFC 2616, section 14.9):


The Cache-Control general-header field is used to specify directives that MUST be obeyed by all caching mechanisms along the request/response chain. The directives specify behavior intended to prevent caches from adversely interfering with the request or response. These directives typically override the default caching algorithms. Cache directives are unidirectional in that the presence of a directive in a request does not imply that the same directive is to be given in the response.


In simple terms, this means that every cache between you and the requesting browser may keep the file for the time span specified in the header. Even the browser itself will cache the file, so if a user encounters, for example, an image that was previously uploaded to S3, the browser will not request that image again until the cached copy expires.
Everybody wins. The page loads faster for the end user and you pay less for traffic.
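As a small sketch of how such a header value is put together: max-age is given in seconds, so a lifetime in days has to be converted first. The helper name cache_control_for and its parameters are made up for this illustration and are not part of any library.

```ruby
# Hypothetical helper: build a Cache-Control value for a lifetime given in days.
SECONDS_PER_DAY = 24 * 60 * 60

def cache_control_for(days, visibility = 'public')
  # max-age is expressed in seconds, so convert the number of days first
  "#{visibility}, max-age=#{days * SECONDS_PER_DAY}"
end

puts cache_control_for(365)
puts cache_control_for(7, 'private')
```

A lifetime of 365 days works out to 31536000 seconds, which is the value used in the upload example in this post.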

If you use Ruby and the RightScale gem (right_aws), here is a simple request that sets the header for caching. Start irb:

require 'rubygems'
require 'right_aws'

# ACCESS_KEY and SECRET_ACCESS stand for your own AWS credentials
s3 = RightAws::S3Interface.new(ACCESS_KEY, SECRET_ACCESS, {:multi_thread => true})
s3.create_bucket('test.bigcurl.de')
# max-age is given in seconds; 31536000 seconds is one year
s3.put('test.bigcurl.de', 'untitled.txt', 'Cache me if you can!', {'Content-Type' => 'text/plain', 'Cache-Control' => 'public,max-age=31536000'})


The result from the web looks like this:


curl -I s3.amazonaws.com/test.bigcurl.de/untitled.txt
HTTP/1.1 200 OK
x-amz-id-2: QWpqMS6h32b+
x-amz-request-id: ECC1EF0ABCAA0AD6
Date: Wed, 19 Nov 2008 00:27:19 GMT
Cache-Control: public, max-age=31536000
Last-Modified: Wed, 19 Nov 2008 00:23:35 GMT
ETag: "ce114e4501d2f4e2dcea3e17b546f339"
Content-Type: text/plain
Content-Length: 14
Server: AmazonS3


Pretty cool, and since the header stays intact when the file is served through Amazon CloudFront, you'll benefit there as well.
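To get a feeling for what the max-age in that response amounts to, here is a rough sketch that splits the Cache-Control value into its directives and adds max-age to the Date header. The helper parse_cache_control is hypothetical, and the calculation deliberately ignores the finer freshness rules a real cache applies (the Age header, for example).

```ruby
require 'time'

# Hypothetical helper: split a Cache-Control value into a hash of directives.
# Directives without an argument (like "public") are stored as true.
def parse_cache_control(value)
  value.split(',').each_with_object({}) do |part, directives|
    name, arg = part.strip.split('=', 2)
    directives[name.downcase] = arg || true
  end
end

directives  = parse_cache_control('public, max-age=31536000')
served_at   = Time.httpdate('Wed, 19 Nov 2008 00:27:19 GMT')
# Simplification: treat the response as fresh until Date + max-age
fresh_until = served_at + directives['max-age'].to_i

puts directives.inspect
puts fresh_until.httpdate
```

With the values from the response above, a cache may keep serving the file for a full year without asking S3 again.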

Keep in mind that some files are more suitable for caching than others. Static content such as pictures, CSS files and JavaScript files is a good candidate for caching. Dynamically generated data whose content might change over time is not.
So be careful about which file types you give a long cache lifetime. Once a browser or an intermediate cache has stored the file, you cannot expire it from the server.

To avoid at least a few hiccups, implement a versioning mechanism like this: flower.jpg becomes flower.1.jpg. If you want to upload a newer version of the flower picture, you simply increase the number, as in flower.2.jpg. Because no cache has seen the new name yet, the new file is fetched right away, and as your app generates links with the new filename, the old file is no longer served.
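A minimal sketch of such a renaming scheme could look like this. The helper name next_version is made up, and it assumes that a trailing number before the extension is always a version counter, which will not hold for names like photo.2008.jpg.

```ruby
# Hypothetical helper: add a version number before the extension,
# or bump it if one is already there.
def next_version(key)
  ext  = File.extname(key)                  # ".jpg"
  stem = key[0, key.length - ext.length]    # "flower" or "flower.1"
  if (m = stem.match(/\A(.+)\.(\d+)\z/))
    "#{m[1]}.#{m[2].to_i + 1}#{ext}"        # flower.1.jpg -> flower.2.jpg
  else
    "#{stem}.1#{ext}"                       # flower.jpg -> flower.1.jpg
  end
end

puts next_version('flower.jpg')
puts next_version('flower.1.jpg')
```

The returned name can then be used both as the S3 key for the upload and in the links your app generates.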

15 November 2008

Using a git frontend with a subversion backend Part 1

I find myself going back to a few unrelated articles about how to use git-svn because I keep forgetting the exact commands. Also, not everybody is familiar with the concept of using Subversion (svn) and git together.

Why should someone mix those two?

The answer is, as usual, not so simple.
  • Maybe your corporate policies do not allow you to use another source code management system (SCM).
  • You have existing infrastructure that you cannot or do not want to change.
  • You do not need the distributed features of git but like the command line tool because it is a little closer to the metal than svn.

These are more or less the reasons why I want to use it that way. git brings some refreshing ideas to the command line but I do not need the distributed features that are often associated with it.
Also I find myself working on a regular basis with other developers who need to use Windows for development.
Windows support in git is not the best, if it exists at all. svn, on the other hand, is pretty good at it. Being able to support both platforms, and maybe something else in the future, is a good thing.

So what are the pros and cons of each technology?
Subversion
  • Around longer: This can be either a good or a bad thing, but svn was already a big icebreaker for many companies to start moving away from CVS.
  • Many companies have existing infrastructure and see no reason to move away from their investment.
  • Backups: This hit one of my mates once. A head crash on his laptop with more than 40 local commits in git and no backup. "Stupid", you might say, but it happens more often than you think. git, with all its possibilities to change a commit after you committed it, keeps many developers from pushing their changes to a publicly viewable repository. "Maybe I forgot something and need to tweak it again" is an often-heard excuse and is discussed here in more detail.
  • Path authorization: Some people want the ability to hide certain parts of a repository from others.
  • Apache integration: No need to run a special daemon to host a repository. Most sysadmins are familiar with Apache and do not need to learn a new tool or set up new tools to monitor the new daemon.
  • Platforms: As I said before, Subversion supports many platforms, including Windows.

Git
  • The index: Having a staging area where you can "park" the things you want to commit, down to individual patches within a file, is awesome.
  • add -p / add -i: Committing only a certain part of an edited file was something I always emulated by using a graphical diff tool. In git this is supported natively.
  • rebase: Forgetting to check in a file ends up as a new commit in Subversion that is not related to the initial one where you forgot the file. In git it is possible to rearrange commits and "squash" them together into one.
  • merge: Merging in git is easy. Merging is easy in Subversion now too, but it only started getting attention when git came around and showed how easy it could be.
  • stash: Stashing lets you set aside uncommitted changes in your working copy and reapply them later, for example on a newly created local branch.
  • Offline commits: You do not need a working network connection to commit your changes.
  • HTTP/git protocol: git over HTTP is one-way only. It does not allow you to push changes back from your local repository to the public repository. If you want that, you have to run the custom daemon and use the git protocol.

git's distributed features are great if you work on open source projects, as they encourage forking the repository and generally more open development. In a more corporate environment you often do not need these features, and luckily git has git-svn, which bridges both worlds.

The following articles will show you how I use git in a pure svn environment.