PostgreSQL functions and triggers

It is desirable to avoid using database functions. They are black boxes which gift only night horrors to unwary developers. A system is difficult to understand, maintain and debug when chunks of it lurk unseen in the DB.

Despite this – for certain features – using them does make sense, provided the database function’s code is source controlled and huge lumps of comments referring to both the location and function of said code are spread evenly throughout the associated application code.

For example, in order to bill Ekaya agents accurately we needed a log of show house status changes, from upcoming to active to past or cancelled. Most of these changes were implemented in sweeping SQL statements that, while efficient at their own task, made it difficult to track individual changes.

Implementing this in the application code would also mean that any new feature that modified the show house status would require someone to remember to add business-essential hooks to update the audit table. In the end it seemed simpler to create a trigger in the DB which watched the show house table and updated an audit table accordingly.

PostgreSQL functions

Triggers trigger functions. In order to have a trigger you must first have an existing function to trigger.

PostgreSQL functions can be written in a number of languages, including Python, but I found PL/pgSQL worked fine for me.

Functions are created using the CREATE FUNCTION command. I recommend using CREATE OR REPLACE FUNCTION as this will handily update the function while you’re initially mucking about with it.

Examples of functions can be found on the CREATE FUNCTION page as well as the Trigger Procedures page. At the top of the Trigger Procedures page there is also a list of special variables you can use in a trigger function. NEW and OLD are especially helpful.

In my experience I have found it prudent to ensure your trigger function always culminates in a RETURN statement of some sort; otherwise triggers regurgitate error codes at you. In my case making the last line of the function code RETURN NEW sufficed.
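To make that concrete, here is a minimal sketch of the kind of audit function described above. The show_house and show_house_audit table and column names are invented for illustration:

-- Hypothetical audit function: log status changes made to a show_house table
CREATE OR REPLACE FUNCTION log_show_house_status() RETURNS trigger AS $$
BEGIN
    -- Only record rows whose status actually changed
    IF NEW.status IS DISTINCT FROM OLD.status THEN
        INSERT INTO show_house_audit (show_house_id, old_status, new_status, changed_at)
        VALUES (OLD.id, OLD.status, NEW.status, NOW());
    END IF;
    -- Always finish with a RETURN; for a row-level trigger RETURN NEW is the usual choice
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;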

Once your function has been created you can view it in the DB using the following command:

SELECT proname, prosrc FROM pg_proc WHERE proname = 'function_name';

Should you desire to remove the function, you can do so in this fashion (note the brackets after the function name):

DROP FUNCTION IF EXISTS function_name() CASCADE;

NB running the above statement with CASCADE will cause any associated triggers to be dropped as well.

PostgreSQL triggers

Once your function has been created you can create your trigger to … trigger it. I recommend reading Overview of Trigger Behaviour and CREATE TRIGGER but in a nutshell:

  • Triggers are attached to a TABLE or VIEW
  • They are fired BEFORE, AFTER or INSTEAD OF an INSERT, UPDATE, DELETE or TRUNCATE statement that acts on their TABLE/VIEW
  • A statement trigger is always fired, but only once per statement. A row trigger is fired once for each row affected by the statement (and not at all if no rows are affected).
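Carrying on with the hypothetical audit function sketched earlier, attaching it to the (invented) show_house table could look something like this:

-- Fire the audit function once for every row changed by an UPDATE on show_house
CREATE TRIGGER show_house_status_audit
    AFTER UPDATE ON show_house
    FOR EACH ROW
    EXECUTE PROCEDURE log_show_house_status();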

Once created you can view a trigger like so:

SELECT * FROM pg_trigger WHERE tgname = 'trigger_name';

And they can be removed like so:

DROP TRIGGER IF EXISTS trigger_name ON table_name CASCADE;

PostgreSQL date formatting

Preamble

Formatting of data should only occur in the final steps of output. Until that point, as a rule, data should internally remain in a base format that can easily be converted into many others without first being converted into another, more basic, format.

For example, a Unix timestamp for dates or a floating point number for money. Both can readily be converted into more expressive formats without writing code to first parse or disassemble the initial format.

However, in a situation where the flow is very specific and unlikely ever to be used to generate a different output, it is permissible, even desirable, to generate the data in the format in which it will finally be output.

PHP vs. the database

It is assumed that it is faster to get the database to perform such formatting and return the requested data pre-formatted than to loop through the data and perform the formatting in PHP. As such we need to use SQL to tell PostgreSQL how to format the data.

SELECT to_char(NOW(), 'YYYY-MM-DD"T"HH24:MI:SSZ') as the_time;

Will produce something like:

       the_time       
----------------------
 2013-10-29T21:06:39Z
(1 row)

NB

to_char(timestamp, text) is the function that formats a date / timestamp into text. The first parameter is the timestamp to be formatted, and the second is the format string to use. The timestamp can just as easily be a timestamp column.
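For example, assuming a hypothetical bookings table with a created_at timestamp column, the same format string works directly on the column:

SELECT to_char(created_at, 'YYYY-MM-DD"T"HH24:MI:SSZ') AS created_time
FROM bookings;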

You can find out more about the function on this page Data Type Formatting Functions. More info on the string format a little further down the same page, here: Template Patterns for Date/Time Formatting.

The " (double quotes) around the T escape it. Without them the final output would have been 2013-10-29THH24:25:15Z, as T is a reserved character in the formatting language.

Lastly, the as the_time can be omitted, in which case the resulting column will be named to_char.

Bash script to download SXSW podcasts

For the last few years I have listened to the podcasts from SXSW while travelling to work or, more recently, while at the gym.

Each year the number of podcasts has increased, to the point where this year I am too lazy to click through to every page and DownThemAll!. Besides, it’s more fun to write a bash script to do the heavy lifting.

You will need …

  • cURL
  • grep
  • uniq
  • wget
  • bash version 4 or above. If you don’t have bash 4, there are slightly different instructions for you.

Except for cURL, and maybe wget, everything else should come standard with your Linux distro. To check which version of bash you’re running, enter the following at the command line:

echo $BASH_VERSION

Consult your distro’s documentation on how to install anything you might be missing and continue to the next section.

The script

Open your favorite text editor and copy and save the following into a new file (I called mine getpodcasts.sh):

#!/bin/bash
for i in {0..13..1}; do
    curl -s http://2009.sxsw.com/interactive/news/videos_and_podcasts/more?page=$i | grep -o 'http.*\.mp3' | uniq | wget -w 2 -i -
done

Let’s go through the script line by line:

#!/bin/bash
The first line tells the system which interpreter to use for this script. You could replace /bin/bash with /usr/bin/python or /usr/bin/php if you had written the script in Python or PHP.

for i in {0..13..1}; do
This starts the bash for loop, using {start..end..increment} brace expansion. This type of for loop will only work in bash v4 and above. If you are using an older version of bash, find out more about other for loops. I got the number 13 from the SXSW site. There are 13 pages.

curl -s http://2009.sxsw.com/interactive/news/videos_and_podcasts/more?page=$i
The next line will be broken up into its components. You can try each component from the command line to watch how the text output gets transformed as we add components to the command. First is the cURL instruction; here we’re using it to download and spit out SXSW HTML pages. cURL normally prints out its progress, but the -s tells cURL to be silent. The URL ends with $i, which is incremented with each pass of the loop. Essentially the for loop and cURL are downloading podcast pages one page at a time.

grep -o 'http.*\.mp3'
Grep searches each line of the downloaded pages for anything that looks like an MP3 URL; -o tells it to print out only the matching part of each line.

uniq
Removes duplicates. The grep matches at least two of the same podcast on each page, and uniq condenses this down to just one. Ordinarily I would use sort -u to first sort and then remove duplicates, but since the returned data is already sorted we can just use uniq.

wget -w 2 -i -
Finally we start downloading the podcasts with wget. The -w 2 tells wget to wait 2 seconds in between each download. This is just to be friendly to the servers. The -i - tells wget to use stdin (the text piped through from uniq) as input file for download links.

Finally

Save your script and chmod it so that it can be executed:

chmod u+x getpodcasts.sh

Run it:

./getpodcasts.sh

It may take a moment while cURL downloads the initial web page (and then every 4 or so podcasts it will pause again), but soon you should be up to your ears in podcasts.

P.S.

If all you want is a list of podcast URLs in a file, there are 2 things you need to do. Firstly change the 3rd line to:

echo `curl -s http://2009.sxsw.com/interactive/news/videos_and_podcasts/more?page=$i | grep -o 'http.*\.mp3' | uniq`

i.e. drop the wget, add an echo in front and wrap the whole line in backticks (`).

Secondly run the script like so to save the data in a file called podcasts.txt:

./getpodcasts.sh > podcasts.txt

Inside every DVD is a small movie trying to get out. Part 1: A quick guide to K9Copy

The problem with backing up a regular store-bought movie DVD is that it simply won’t fit onto a normal blank DVD. The movie DVD is a 9GB monster, and the blank a svelte 4.4GB. There are two solutions to this: re-encoding and transcoding. This article is a quick guide to performing a transcode using K9Copy (http://k9copy.sourceforge.net/), similar to DVDShrink, on GNU/Linux.

Simplifying terribly, re-encoding takes the better part of a day or longer and the result is an .avi or similar file. These are so much smaller than the original that you can fit four or more re-encoded movies onto a single 4.4GB DVD, but they won’t play in a regular DVD player. On the other hand a transcode takes about an hour and results in a single movie on a 4.4GB DVD, which will play on a regular DVD player. Transcoding works by lowering the quality of the movie to make it smaller. The smaller the desired end result, the worse it is going to look, but with a normal-sized movie you shouldn’t be able to tell the difference.

You will need

  • To have successfully consulted your distro’s documentation and installed K9Copy.
  • I’ve heard that you will need about 8GB of hard drive space free. Never tested this.
  • Optional: to have installed MPlayer as well.

Method

  1. Insert the DVD you want to copy.
  2. After it has loaded, open K9Copy.
  3. Press Open. I think this is the biggest barrier to entry in this program.
  4. K9Copy should load a tree structure showing all the titles available on the DVD. A DVD can contain up to 99 titles; one of them will be the movie you’re looking for. The others are things like menus, extras, trailers and warnings. As a rule of thumb the largest title is the one you’re looking for. To check you have the right one, run the following in the command line (replace n with a number from 1 to 99): mplayer dvd://n. Other things to try while viewing in MPlayer are pressing “#” to cycle through the audio tracks in the title and “j” to cycle through subtitles.
  5. Open the title you want. Don’t be confused by the titlesets; you want the title (in the image below the title I want is number 1, mplayer dvd://1, but it is part of titleset 5 – I honestly don’t know what titlesets are). You can transcode the entire DVD – menus, extras, trailers and all – but those take up space, which leaves less space for the actual movie, and the end result may suffer. So it’s a tradeoff; I tend to keep only the main feature.
  6. Under the title, select the video, audio, and subpictures (subtitles) you want to copy. In the image below I have selected the video and only one audio stream; for this movie I don’t need the other audio streams, nor do I need any of the subtitles.
  7. Select Copy, find somewhere to save it, and wait. First the DVD will be copied to your hard drive and then formatted into an .iso image that you can easily burn to a blank DVD.

If you want to check your .iso image before burning you can do so with this command (replace filename.iso with the name of the file you created): mplayer dvd:// -dvd-device filename.iso. As before you can cycle through your audio and subtitle tracks to see that they are all there. Note: you will not be able to see any copied menus in MPlayer; these will only show up when you place the burnt DVD into a DVD player.

Hope this helped you. In part 2, I will describe re-encoding a DVD into an H.264 mkv file.

PHP cURL over SSL

If you’re getting stuck trying to use cURL over https in PHP, try setting both the verify peer and verify host options to false (note that this disables certificate checking, so only do it against hosts you trust):

$url = 'https://myverysecret.domain/secrets/';
$curl = curl_init($url);
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, false);
curl_exec($curl);

PAM and the Bad password blues

Warning: Only try this at home! Using weak passwords on a computer that is accessible from the wild and dark Internet is tantamount to walking up to a spammer and saying “I’d simply love to be part of your zombie network – where do I sign up?”. I could safely do the following because this server is not accessible from the Internet and never will be; it’s a local test box for my own personal use.

I was creating a new user on a local CentOS 5.3 VirtualBox and while I was setting the password I received the following error: BAD PASSWORD: it is based on a dictionary word

After some soul searching I found I didn’t feel like coming up with, and then remembering, a password complicated enough to make PAM happy, i.e. not a dictionary word, long enough, etc.

So I spent a while reading up on PAM—which, as it turns out, is a small team of alluring ladies and well worth stealing a look at.

Turns out my problem has a name, and that name is pam_cracklib.so. Ms. CrackLib will diligently check a new password against her dictionary and then check whether it is significantly different from the previous version, whether it is long enough, etc. Much of what she does is negotiable, but the dictionary check in the beginning she won’t budge on.

So either we mess with her dictionary reading abilities—by say giving her a blank dictionary or hiding her glasses—or we take her out of the loop completely. I opted for the latter and set about cutting her out of my life.

The surgery took place in /etc/pam.d/system-auth. I took the following lines:

password    requisite     pam_cracklib.so try_first_pass retry=3
password    sufficient    pam_unix.so md5 shadow nullok try_first_pass use_authtok

and turned them into:

#password    requisite     pam_cracklib.so try_first_pass retry=3
password    sufficient    pam_unix.so md5 shadow nullok try_first_pass

I just commented out the pam_cracklib.so line and removed the use_authtok from the line below; otherwise passwd complains about “Authentication information cannot be recovered”.

SQL – Find the last DISTINCT items

This took me a while to figure out, so thought it worth documenting. Here is a simplified example that explains the problem and solution, tested on MySQL 5.0.

An office worker keeps track of who she has contacted. After a while she builds up a table as follows:

Id  contact
 1  josef
 2  harry
 3  sally
 4  pudding
 5  pudding
 6  sally
 7  harry
 8  sally

Now she needs to see who she contacted the most recently and in what order. In other words, with the above data she wants the following list: sally, harry, pudding, josef.

Her immediate reaction is SQL like this:

SELECT DISTINCT contact FROM table ORDER BY id DESC;

This returns the correct data, but in the wrong order: pudding, sally, harry, josef. This is because DISTINCT seems to remember only the first instance as it appears in the table; all the rest are considered duplicates and ignored.

After some effort, the solution turns out to be not to use DISTINCT, but rather GROUP BY and ORDER BY MAX() to invoke magic:

SELECT contact FROM table GROUP BY contact ORDER BY MAX(id) DESC;

Excitingly that returns the correct data in the correct order. We could still use a DISTINCT in there, but it is superfluous and would add unnecessary computation; GROUP BY does the job of DISTINCT.

The example could be further complicated. For example, the table could have a timestamp added (ORDER BY MAX would work on that too), and a user id so that multiple office workers could use it (include a WHERE user_id = x to find results only for a particular user).
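As a rough sketch – assuming the table is called contacts and has hypothetical user_id and contacted_at columns – that version might look like this:

SELECT contact
FROM contacts
WHERE user_id = 42                  -- only this office worker's contacts
GROUP BY contact
ORDER BY MAX(contacted_at) DESC;    -- most recently contacted first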

Monitoring bandwidth with bwm-ng

I discovered how to watch the bandwidth of a server today using a tool called bwm-ng. It’s in the Ubuntu Hardy repos; you can install it with the following command:

sudo aptitude install bwm-ng

When you run it in a terminal without parameters, it displays a running update of how much bandwidth is being used. This makes it a handy tool to get an immediate idea of what your network traffic is like, and there are also options to examine disk IO. Besides the “live running commentary” mode, it can also output in other forms, for example CSV.

To create a poor man’s ntop, i.e. generate a log file of a server’s bandwidth throughput over time, I wrote a script that is cronned every minute. The interesting part looks like this:

bwm-ng -o csv -c 6 -T rate -I eth0 >>bandwidth.log

Here’s a rundown of what the parameters do:

-o csv
Output to CSV format. Annoyingly you have to download the tarball from the site linked above to get the file which contains the legend for the CSV generated. Why not just include it in the man page? Being the considerate type I have included this information later in this post.
-c 6
Print only 6 reads. Not 6 lines, but 6 reads of all the interfaces, including a total line.
-T rate
Type rate. Show only the rate; other options for this parameter are avg and max. Without this parameter the output also shows the amount of data and packets transferred.
-I eth0
Comma separated whitelist of interfaces to show. Can also be used to blacklist interfaces. Even with only one interface this still prints a total line.

This generates something which looks like this:

1222420795;eth0;526.73;237.62;764.36;120;266;1.98;3.96;5.94;2;1;0.00;0.00;0;0
1222420795;total;526.73;237.62;764.36;120;266;1.98;3.96;5.94;2;1;0.00;0.00;0;0

And as promised the legend for the output:

unix_timestamp;iface_name;bytes_out;bytes_in;bytes_total;packets_out;packets_in;packets_total;errors_out;errors_in

All this data is then appended to a file called bandwidth.log ( >>bandwidth.log ).

Using lmc_tween in Flash 8 without installing the extension

My brain is made of honey. In my previous post on lmc_tween I wrote that I would explore ways of getting lmc_tween working in Flash 8 which did not require installing the extension.

The obvious solution that came to me five minutes after posting is to stick all the lmc_tween classes into a directory and include it in your classpath.

Disclaimer: I recommend this method only for working with old code that uses lmc_tween. For new projects I strongly suggest you use Fuse Kit instead.

  1. Go to the lmc_tween site.
  2. Scroll down to the “Ziped version” (sic), and download version 120. Direct link. (lmc_tween is no longer in development, this is the last version).
  3. Extract the downloaded file and copy the new directory to your source folder.
  4. Somewhere inside your code you should have

    #include "lmc_tween.as".

    You need to modify this to point to the new source directory. For instance I renamed the extracted directory “lmc_tween”, so my line changes to

    #include "lmc_tween/lmc_tween.as".

  5. Add your lmc_tween directory to your FLA’s classpath: File > Publish Settings > Flash. Next to ActionScript version, select the Settings... button and add your extracted directory to the list of classpaths.

  6. If you have previously installed the lmc_tween extension, you can disable it now by selecting:

    Help > Manage Extensions

    and unticking the box next to Movieclip Tweening Prototypes. You will have to restart Flash.

Now you, and anyone else who works on this project, will be able to compile it without installing the extension or overwriting Macromedia’s core classes.

Happy Minor Edits!

lmc_tween & Flash 8 – the Bain of my existence

Update 2007/05/23: I’ve found a better way of dealing with legacy lmc_tween code.


Maybe a little harsh; it’s definitely frustrating though.

Since I started my new job in November I have had to deal with legacy code left behind by the previous Flash developer.

I’m not one for working with another’s code to start with. I always get an itch to redo from scratch about a quarter of the way in, but time waits for no redevelopment. I end up hacking at code till it limps in the right direction.

The previous developer had a taste for lmc_tween.

One of lmc_tween’s pitfalls is that in order to use it to extend MovieClips you need to replace Macromedia’s core classes, which means every person in the workflow who has to work with it needs to know about this and mess with their own core classes. This leaves me feeling uneasy.

Of course the previous developer extended his MovieClips in every project — it’s a natural way of working with objects.

And of course the above method works only in Flash MX 2004. Flash 8 just spits out errors and does no tweens. I have always managed to work my way around this, but tonight I decided to find a proper solution.

After lots of googling I found a very simple answer. Put the location of the two classes you need to replace in your class path:

Preferences > ActionScript > ActionScript 2.0 Settings

C:\Documents and Settings\user\Local Settings\Application Data\Macromedia\Flash 8\en\Configuration\Shared\zigo

This way your core classes remain untainted and the tweens bounce about happily.

This does not help the designers on their Macs, though. I’m going to investigate the possibility of not having to install the extension. Trying to explain to a client over the phone how to install lmc_tween is an enriching exercise.

If you’re looking for programmatic tweens — the newer and friendlier Fuse Kit reigns supreme at the moment. It uses the same Zigo engine and Penner tweens as lmc_tween.

You do not have to install the extension; just place the classes in your source directory and import them normally. That way you can package the Fuse classes with the project, and anyone who comes into contact with it can build it without jumping through any hoops.

It can do more than that though, Fuse is a sequencer…