Archive for July, 2008

MySpace and Opensocial readings

Tuesday, July 29th, 2008

First read the introduction on Opensocial

Then read about the sample Hello World application

Then read about developping an app

Some interesting Memcached readings

Friday, July 25th, 2008

Memcached internals

Memcached vs APC

Memcached as simple message queue

Use Gearman to avoid cache miss stampedes

Scaling memcached at Facebook

Speed test: ON DUPLICATE KEY UPDATE statement vs (DELETE then INSERT)

Thursday, July 24th, 2008

Is it faster to issue two SQL statements (DELETE then INSERT) or craft a single query using ON DUPLICATE KEY UPDATE?

Single query:

INSERT INTO items SET user_id=3463574, item_id=1, amount = 1 ON DUPLICATE KEY UPDATE amount = amount + 1;

Multiple queries:

DELETE FROM items WHERE user_id=3463574 AND item_id=1;
INSERT INTO items SET user_id=3463574 , item_id=1, amount = 66;

Difference in time (using microtime() in PHP):
ON DUP – 0.00023600000000001
ON DUP – 0.00022699999999998
ON DUP – 0.00014599999999998
ON DUP – 0.00014899999999998
DEL AND INSERT – 0.000336
ON DUP – 0.00015000000000004
DEL AND INSERT – 0.00046600000000002

The del/insert is 3 times slower than the ON DUPLICATE statement.

Change a column type along with the table engine in one query

Wednesday, July 23rd, 2008

You have a table to migrate from MyISAM to InnoDB, and while you’re at it, you want to modify the definition of some columns.

For instance we want to change the engine of the following table, along with changing the “amount” column to an unsigned int:

mysql> show create table items;
| Table | Create Table                                                                                                                                                                                                                          |
| items | CREATE TABLE `items` (
  `user_id` bigint(20) NOT NULL,
  `item_id` tinyint(4) NOT NULL,
  `amount` int(11) default '0',
  `created_at` timestamp NOT NULL default CURRENT_TIMESTAMP,
  PRIMARY KEY  (`user_id`,`item_id`)

You could do that in two queries…:

Query OK, 0 rows affected (0.12 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> ALTER TABLE items ENGINE InnoDB;
Query OK, 0 rows affected (0.18 sec)
Records: 0  Duplicates: 0  Warnings: 0

… but in practice it will take twice as long as a single query, as every time MySQL will have to make a copy of the whole table anyway.
So here’s the single query:


This operation is extremely slow so keep that in mind on your production server.

For instance it took 6 minutes to execute on a staging server with only 700k rows.

The production DB has 8 Million rows…

Import a table from another DB

Tuesday, July 22nd, 2008

If the tables have the same schema, you can directly do a subquery with a SELECT *:

INSERT INTO pirates_myspace.users SELECT * FROM mafiawars_myspace.users LIMIT 500;

However if there is an auto-increment field in the schema, you will have to select NULL for the auto-increment column…:

INSERT INTO pirates_myspace.items (SELECT NULL, now(), modified, expiration, type, category, name FROM mafiawars_myspace.items WHERE id >=150);

… meaning you will need to type one by one all the column names you want to copy over, which in most of the cases mean all of them minus the auto-increment field.

View your changes before SVN up

Wednesday, July 16th, 2008

You changed your files locally, all is neat and clean and you are ready to commit.
But you know others may have worked on the same portions of code you worked on, so you fear that doing “svn up” before committing will pollute your neat local files with conflict resolution code sections.
You need at least a heads up, here’s how:

svn st -u

It will diff your local files with the current head on the server.

Recursive diff: which files were changed?

Tuesday, July 15th, 2008

Here’s how to tell which files were changed between two directories, if you don’t need to know the details of the differences:

diff -Bbrq folder1 folder2

-B = Ignore changes that just insert or delete blank lines.
-b = Ignore changes in amount of white space.
-r = When comparing directories, recursively compare any subdirectories found.
-q = Report only whether the files differ, not the details of the differences.

It comes out handy to compare two directories checked out of SVN (for instance the prod folder and the staging folder).

LEFT OUTER JOIN from outer space

Friday, July 11th, 2008

If you want to join two tables but one has only partial information, you cannot use the implicit join, you need a LEFT OUTER JOIN.

Here’s a concrete example using two tables, Users and Groups:

1 Pierre
55 David
66 Roger

1 55
1 66
1 -1

If you want to list all the members of the group belonging to Pierre, you expect to be returned IDs 55, 66 and -1.

Doing it the implicit way will not return -1, as he’s not listed in the Users table:

SELECT member_id, name FROM groups, users WHERE member_id=id AND leader_id='1';

Using the LEFT OUTER JOIN will return this ghost user:

SELECT member_id, name FROM groups LEFT OUTER JOIN users ON groups.member_id = WHERE leader_id='1';

Here is a good explanation on LEFT OUTER JOIN.
(They use these tables in their example)

Remove new lines at the end of PHP scripts

Thursday, July 10th, 2008

New lines at the end of PHP files cause problems when including the file as you may not want this extra line.

For instance it messes up redirections in Myspace applications.

PDO cannot rewind or seek

Wednesday, July 9th, 2008

Basically it means you cannot retrieve one column at a time, as the first column you retrieve will move the pointer to the end of the data, and as you cannot rewind, the next call to retrieve the second column will return nothing, because you’re already at the end of the data.


$sql = "SELECT response, response_comment FROM responses";
$stmt = $dbHandle->prepare($sql);
$stats = $stmt->fetchall(pdo::FETCH_COLUMN, 0);

Then if you proceed to fetch the second column… :

$stats = $stmt->fetchall(pdo::FETCH_COLUMN, 1);

… it will return nothing.

You would have to execute the query again before running a second fetchall().

If it had worked, it could have been used to get the 2 columns in 2 arrays, then run an array_combine() to get the columns of the first array as keys and the columns of the second array as values. Too bad.