Symfony2 and RabbitMQ: Lessons learned

Last year we introduced RabbitMQ into our stack at Waarneembemiddeling.nl. We were in desperate need of a worker queue and after fiddling around with Gearman, Beanstalkd and RabbitMQ we made our choice: RabbitMQ it will be.

There is quite a lot of information to be found on RabbitMQ and how to use it, but many things you have to find out yourself. Questions like:

  • what happens to messages on the queue after a service restart?
  • what happens to messages on the queue after a reboot?
  • how do we notice that a worker crashed?
  • what happens to my message when the consumer dies while processing?
  • etc.

Using RabbitMQ with Symfony2 (or PHP in general) is quite easy. There is a bundle for Symfony2 called OldSoundRabbitMqBundle and a PHP library called php-amqplib, and both work very well. Both are from the same author, so you should probably thank him for that 🙂 .

First try: pure php consumers

We’re running a fairly common setup. Because we’ve been warned that PHP consumers die every now and then, we’re using Supervisor to start new consumers when needed. There is a lot of information out there on this subject, so I won’t go into it here.

Despite the warnings we started with pure php consumers powered by the commands in OldSoundRabbitMqBundle. The first workers were started like this:

This means we’re consuming from the async_event queue without any limit on the number of messages. Basically this means it will run forever, or better said: until PHP crashes. Or worse: your consumer ends up in an unresponsive state, which means it doesn’t process any messages any more while Supervisor thinks all is fine because you still have a running process. This happened once to our mail queue. I can assure you it’s better to prevent this kind of thing.

Second try: pure php consumers with limited messages

So after the mail-gate I was searching for a quick way to make our setup more error proof. The OldSoundRabbitMqBundle supports limiting the messages to process. So I limited our workers so that they got restarted a couple of times a day:
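A minimal sketch of what such a limited consumer start could look like, assuming the bundle's rabbitmq:consumer command and its -m (messages) option. The CONSOLE variable, the wrapper function and the limit of 250 are illustrative assumptions, not the values from our setup:

```shell
#!/bin/sh
# Sketch only: CONSOLE and the limit used in the example are illustrative
# assumptions, not values taken from the original setup.
CONSOLE="${CONSOLE:-app/console}"

# Start a consumer that exits after a fixed number of messages, so that
# Supervisor replaces it with a fresh PHP process.
# $1 = consumer name, $2 = maximum number of messages to process
start_limited_consumer() {
    "$CONSOLE" rabbitmq:consumer -m "$2" "$1"
}

# Example: start_limited_consumer async_event 250
```

In a Supervisor program definition the resulting command line would be used directly, with automatic restarts enabled so a new consumer takes over as soon as the limit is reached.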

After that things ran more smoothly and it took a while before we encountered new problems. While digging through the logs I noticed that some consumers produced errors. A brief summary:

  • General error: 2006 MySQL server has gone away
  • Warning: Error while sending QUERY packet.

Because the consumer is one process that keeps running, the service container and everything in it also stays in memory. Once you’ve done some queries, the database connection stays open in the background. And if it’s quiet on our queue, it may take some time before we reach the message limit. If the connection sits idle for longer than the wait_timeout of your MySQL server, the server closes it and you’ll run into the warnings and errors about lost connections.

Of course we could close the connection after each message is processed, catch the Doctrine DBAL connection exceptions, or increase the wait_timeout setting, but that’s just denying the real problem. Running consumers with a booted Symfony2 kernel just doesn’t work so well.

A final resort could be to strip down the consumers and not use the Symfony2 kernel and container, but we don’t like that. Most of the time our messages are serialized events which get dispatched again after the consumer picks them up. At the application level we don’t want to know whether we are in a RabbitMQ consumer or in a normal HTTP request.

Real solution: rabbitmq-cli-consumer

So it took a couple of months to learn the hard way that we needed a different solution for our consumers. I found this interesting blog post about the same problem. He solved it with Java and Ruby consumers. We all learned Java in college, right, but I don’t like to run the memory-eating JVM on our servers. The Ruby consumer unfortunately lacks good documentation for a Ruby newcomer like me. So I got a bit lost there.

That was the point where Go came in. Go is a kind of improved C with no real OO but a lot of cool stuff in it. I wrote an application that makes it possible to consume messages from a RabbitMQ queue and pipe them into a command line application. I called it: rabbitmq-cli-consumer.

The main advantages for using rabbitmq-cli-consumer are:

  • no more stability issues to deal with
  • lightweight and fast
  • no need to restart your workers after a fresh deployment

We still use supervisor to start and stop the consumers because it’s the right tool for it. An example of how we start a consumer:
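A rough sketch of such a start command, assuming the flags as I recall them from the rabbitmq-cli-consumer README (-e for the executable, -c for the configuration file, -V for verbose output). The wrapper function, the paths and the acme:process-event command name are hypothetical:

```shell
#!/bin/sh
# Sketch only: flags as recalled from the rabbitmq-cli-consumer README
# (-e executable, -c configuration file, -V verbose). Check them against the
# version you install. All paths and names below are hypothetical.
CONSUMER_BIN="${CONSUMER_BIN:-rabbitmq-cli-consumer}"

# $1 = command to run for every message, $2 = path to the consumer config file
start_cli_consumer() {
    "$CONSUMER_BIN" -e "$1" -c "$2" -V
}

# Example:
# start_cli_consumer "/path/to/app/console acme:process-event --env=prod" /path/to/consumer.conf
```

Because every message starts a fresh PHP process, a deployment only has to replace the code on disk; the long-running Go process itself does not need a restart.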

An example of a Symfony2 command we use:

Final tip: use the management plugin

Before even starting with RabbitMQ, make sure you have the management plugin installed. It gives you a good overview of what’s happening. You can also purge queues, add users, add vhosts, etc.

How to use Codeship with Symfony2, phpspec and Behat

My coworkers and I at waarneembemiddeling.nl are really fond of phpspec and Behat. Yes, we must confess: we didn’t test much until a couple of months ago. We skipped the PHPUnit age and started right away with phpspec and Behat. We also like services, so instead of setting up (and maintaining) our own CI server, we use Codeship. To be honest we fell in love with Travis, but that was a little bit too expensive for us. And so our search ended at Codeship.

There is some documentation on how to use it with PHP, but it’s not that in-depth about phpspec and friends. Let’s start with phpspec, as this is pretty easy. I’m assuming you install phpspec and Behat as dev dependencies using Composer:

phpspec

Now head over to codeship.com and edit your project’s configuration. Pick “PHP” as your technology (didn’t see that one coming). In the “setup commands” field we first select the desired PHP version:

Next install deps (I believe this line is placed there by default by the codeship guys):

Then add phpspec to the “test commands” field:

Et voila, phpspec should now be functioning. 🙂

Behat

Behat is a little bit more difficult. The first problem you need to solve is getting the MySQL credentials into your Symfony2 application. These are provided through environment variables, but their names differ from the naming convention in Symfony2.

We start by changing our app/config/config_test.yml:

Now to let Symfony2 pick up the environment vars we have to follow the convention I just mentioned. This means that an environment variable with the name SYMFONY__TEST_DATABASE_USER will be recognised when building the container. But let’s start by adding a bash script to ease the setup of the testing environment (locally and Codeship). Call it setup_test_env.sh and place it in the root of your project:
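A minimal sketch of the variable mapping in such a script. It assumes the CI exposes the credentials as MYSQL_USER and MYSQL_PASSWORD; the fallback values are hypothetical local defaults, and any database setup commands you run afterwards are left out:

```shell
#!/bin/bash
# setup_test_env.sh (sketch): map the CI's MySQL credentials onto the
# SYMFONY__* names that Symfony2 turns into container parameters.
# MYSQL_USER and MYSQL_PASSWORD are assumed to be the names the CI provides;
# the fallbacks are hypothetical defaults for a local machine.
export SYMFONY__TEST_DATABASE_USER="${MYSQL_USER:-root}"
export SYMFONY__TEST_DATABASE_PASSWORD="${MYSQL_PASSWORD:-}"
```

Symfony2 strips the SYMFONY__ prefix and lowercases the rest, so these two end up as the parameters test_database_user and test_database_password, which config_test.yml can reference as %test_database_user% and %test_database_password%.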

Then adjust your codeship setup commands and add:

Last but not least add the behat command to the test commands:

Things should be working now. Quickly enough you will run into the infamous xdebug “Fatal error: Maximum function nesting level of ‘100’ reached” error. Let’s fix this right away and add this in your setup commands:
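One way to do this is to append a higher xdebug.max_nesting_level to the php.ini used on the build machine. The sketch below assumes exactly that; the helper function, the ini path and the value of 250 are illustrative, so check which php.ini your build actually loads (php --ini tells you):

```shell
#!/bin/sh
# Sketch only: the helper name, the ini path and the limit are assumptions.
# xdebug.max_nesting_level is the Xdebug setting behind the error message.
# $1 = path to the php.ini that the PHP CLI loads, $2 = new nesting limit
raise_xdebug_nesting_level() {
    echo "xdebug.max_nesting_level = $2" >> "$1"
}

# Example: raise_xdebug_nesting_level /path/to/php.ini 250
```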

Summary

So the complete setup commands dialog for phpspec and Behat together looks like this:

And the test commands like this:

Everything should be working fine now! To run your tests locally, don’t forget to first execute the bash script (notice the extra dot, it is required):
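Why the dot matters: it sources the script into your current shell, so the exported variables are still there when you run Behat afterwards. Executing the script normally runs it in a child process and the variables are gone as soon as it exits. A small self-contained demonstration with a throwaway script (DEMO_VAR is a made-up name):

```shell
#!/bin/sh
# Self-contained demo with a throwaway script; DEMO_VAR is a made-up name.
script="$(mktemp)"
echo 'export DEMO_VAR=from_script' > "$script"

unset DEMO_VAR
sh "$script"                       # runs in a child process
after_run="${DEMO_VAR:-unset}"     # the export never reached this shell

. "$script"                        # sourced into the current shell
after_source="${DEMO_VAR:-unset}"  # now the variable is visible here

echo "executed: $after_run, sourced: $after_source"
rm -f "$script"
```

For the real script that would be something like `. ./setup_test_env.sh` from the project root.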

Happy testing! 😉

Slow initialization time with Symfony2 on vagrant

A few days ago we moved our complete infrastructure to a different hosting provider. We also made the switch from CentOS to Debian. So we also got a fresh development environment using Debian and Vagrant (and the latest PHP and MySQL of course :)).

We expected the new dev box to be fast, but the opposite happened: it was slow as hell. And when I say slow as hell, I mean terribly slow (10 – 20 seconds, also for the debug toolbar). In the past we had some more problems with performance on VirtualBox and Vagrant. There are some great posts out there on this subject (here and here) which we had already applied to our setup. In a nutshell:

  • change logs and cache dir in AppKernel
  • use NFS share

The cause: JMSDiExtraBundle

After some profiling I discovered so many calls originating from JMSDiExtraBundle that I tried disabling the bundle. And guess what: loading time dropped to around 200ms!

The real problem was the way the bundle was configured:

This causes the bundle to search through all your PHP files in those locations. Apparently in the old situation (PHP 5.3 and CentOS) this wasn’t as problematic as in the new situation (php-fpm 5.5, Debian).

Speed up your data migration with Spork

One of the blogs I like to keep an eye on is Kris Wallsmith’s personal blog. He is a Symfony2 contributor and also the author of Assetic and Buzz. Last year he wrote about a new experimental project called Spork: a wrapper around pcntl_fork that abstracts away the complexities of spawning child processes with PHP. The article was very interesting, although I didn’t have a valid use case to try the library out. That was, until today.

As it happens, we were preparing a rather large data migration for an application with approximately 17,000 users. The legacy application stored the passwords in an unsafe way – plaintext – so we had to hash ’em all during the migration. Our weapon of choice was bcrypt, and the BlowfishPasswordEncoderBundle made implementing it easy. Using bcrypt did introduce a new problem: encoding all these records would take a lot of time! That’s where Spork comes in!

Setting up the Symfony2 migration Command

If possible I wanted to fork between 8 and 15 processes to gain maximum speed. We’ll run the command on a VPS with 8 virtual cores, so I want to stress the machine as much as possible ;). Unfortunately the example on GitHub as well as the one on his blog didn’t work any more, so I had to dig in just a little bit. I came up with this to get the forking working:

The command generates the following output:

Make it a little bit more dynamic

To be really useful I’ve added some parameters so we can control the behavior a little more. As I mentioned before, I wanted to control the number of forks, so I added an option for this. This value needs to be passed on to the constructor of the ChunkStrategy:

I also added a max parameter so we can run some tests on a small set of users, instead of the whole database. When set I pass it on to the setMaxResults method of the $query object.

Storing the results in MySQL: Beware!

In Symfony2 projects storing and reading data from the database is pretty straightforward using Doctrine2. However, when you start forking your PHP process, keep the following in mind:

  1. all the forks share the same database connection;
  2. when the first fork exits, it will also close the database connection;
  3. database operations in running forks will yield: General error: 2006 MySQL server has gone away

This is a known problem. In order to fix this problem I create and close a new connection in each fork:

That’s basically it. Running this command on a VPS comparable to a c1.xlarge Amazon EC2 server did speed things up a lot. So if you’re also working on an import job like this which can be split up into separate tasks, you know what to do… Give Spork a try! It’s really easy, I promise.

UPDATE 2013-03-19
As stated in the comments by Kris, you should close the connection just before forking. Example of how to do this:

Symfony2 authentication provider: authenticate against webservice

The past few days I have really been struggling with the Symfony2 security component. It is the most complex component of Symfony2 if you ask me! On the symfony.com website there is a pretty neat cookbook article about creating a custom authentication provider. Despite the fact that it covers the subject pretty well, it lacks support for form-based authentication use cases. In the current Symfony2 project I’m working on, we’re dealing with a web service that we need to authenticate against. So the cookbook article was unfortunately nothing more than a good introduction.

Using DaoAuthenticationProvider as example

Since we don’t want to reinvent the wheel, a good place to start is by investigating the providers that are in the Symfony2 core. The DaoAuthenticationProvider is a very good example, and it is used by the default form login. We are going to add a few pieces of code, so we can reuse the listener and configuration settings. The only things we want to change are the authentication itself and the user provider. If you take a look at the link above, you will see the only thing we need to change is the checkAuthentication method. But a few more steps are needed in order to make things function correctly. Let’s begin! 🙂

We also need a UserProvider!

First things first: we need a custom user provider. The task of the user provider is to load the user from a source so the authentication process can continue. Because a user can already be registered at the webservice, a traditional database user provider won’t work. We need to create a local record for every user that registers or logs in and doesn’t have an account. So basically the user provider is only responsible for loading and creating a user record. In this example I save the user immediately when there is no record; you probably want to do this after authenticating.

The code for the user provider looks like this:

We add it to our services configuration in app/config/services.yml:

Creating the AuthenticationProvider

As I said earlier we are going to base our provider on the DaoAuthenticationProvider. In my bundle I created a new class called ServiceAuthenticationProvider. Like our example we are extending the abstract UserAuthenticationProvider. Besides the checkAuthentication method we also must implement the retrieveUser method. We inject the service through the constructor, so the class looks like this:

Note the call to $this->service->authenticate where the magic happens. The retrieveUser method receives a User instance from our user provider. Although this is not really clear in the code above, it will be after configuration in the service container. We use the configuration from the Symfony core and adjust it to our needs:

Please note the empty arguments. Looks a bit strange, huh? These will be magically filled in by our Factory when the container is built! This is a bit tricky, and the cookbook explains it pretty well, so I suggest taking a look there. We are extending the FormLoginFactory because we want to change it a bit:

Add the builder in the Acme\DemoBundle\AcmeDemoBundle.php file:

Finally, change your security config:

The webservice-login key activates our authentication provider. The user provider is defined under providers as acme_provider with the corresponding service id.
I used the AcmeDemo bundle from symfony-standard repository, so you could just copy paste most of my code to see everything in action! Only thing you need to provide yourself is a dummy webservice.

Happy coding!

How to use Symfony2 entities from a bundle in vendor/bundles

Today I was working on a bundle for generating invoices. Since we have built some kind of invoice functionality in many past projects, my goal was to create a nice reusable bundle.
So I started with creating an empty bundle and moved it to /vendor/bundles/Netvlies/Bundle/InvoiceBundle.

With app/console I started generating an entity for persisting particular data for an invoice:

This is still all pretty straightforward… so to complete this difficult task I tried to update the schema:

Hmmz, wtf? 🙁 For some reason it was ignoring my new bundle outside the src structure. After a little investigation I discovered you have to add it to your ORM mapping like this: