Databases and the Doctrine ORM


DQL GROUP BY syntax:

DQL HAVING syntax:

The GROUP BY and HAVING clauses can be used for dealing with aggregate functions. The following aggregate functions are available in DQL: COUNT, MAX, MIN, AVG and SUM.

  • Selecting the minimum account amount::

        $q = Doctrine_Query::create()
            ->select('MIN(a.amount)')
            ->from('Account a');

        echo $q->getSqlQuery();

    The above call to getSqlQuery() would output the following SQL query:

    .. code-block:: sql

        SELECT MIN(a.amount) AS a_0 FROM account a

  • Selecting the sum of all Account amounts::

        $q = Doctrine_Query::create()
            ->select('SUM(a.amount)')
            ->from('Account a');

        echo $q->getSqlQuery();

    The above call to getSqlQuery() would output the following SQL query:

    .. code-block:: sql

        SELECT SUM(a.amount) AS a_0 FROM account a

  • Using an aggregate function in a statement containing no GROUP BY clause results in grouping on all rows. In the example below we fetch all users and the number of phonenumbers they have::

        $q = Doctrine_Query::create()
            ->select('u.username')
            ->addSelect('COUNT(p.id) as num_phonenumbers')
            ->from('User u')
            ->leftJoin('u.Phonenumbers p')
            ->groupBy('u.id');

        echo $q->getSqlQuery();

    The above call to getSqlQuery() would output the following SQL query:

    .. code-block:: sql

        SELECT u.id AS u_id, u.username AS u_username, COUNT(p.id) AS p_0 FROM user u LEFT JOIN phonenumber p ON u.id = p.user_id GROUP BY u.id

  • The HAVING clause can be used for narrowing the results using aggregate values. In the following example we fetch all users who have at least 2 phonenumbers::

        $q = Doctrine_Query::create()
            ->select('u.username')
            ->addSelect('COUNT(p.id) as num_phonenumbers')
            ->from('User u')
            ->leftJoin('u.Phonenumbers p')
            ->groupBy('u.id')
            ->having('num_phonenumbers >= 2');

        echo $q->getSqlQuery();

    The above call to getSqlQuery() would output the following SQL query:

    .. code-block:: sql

        SELECT u.id AS u_id, u.username AS u_username, COUNT(p.id) AS p_0 FROM user u LEFT JOIN phonenumber p ON u.id = p.user_id GROUP BY u.id HAVING p_0 >= 2

    You can access the number of phonenumbers with the following code::

        $users = $q->execute();
        foreach ($users as $user) {
            echo $user->name . ' has ' . $user->num_phonenumbers . ' phonenumbers';
        }


Most users at one time or another have dealt with hierarchical data in a SQL database and no doubt learned that the management of hierarchical data is not what a relational database is intended for. The tables of a relational database are not hierarchical (like XML), but are simply a flat list. Hierarchical data has a parent-child relationship that is not naturally represented in a relational database table.

For our purposes, hierarchical data is a collection of data where each item has a single parent and zero or more children (with the exception of the root item, which has no parent). Hierarchical data can be found in a variety of database applications, including forum and mailing list threads, business organization charts, content management categories, and product categories.

In a hierarchical data model, data is organized into a tree-like structure. The tree structure allows repeating information using parent/child relationships.

There are three major approaches to managing tree structures in relational databases:

  • the adjacency list model
  • the nested set model (also known as the modified pre-order tree traversal algorithm)
  • the materialized path model
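As a rough, Doctrine-independent illustration of the first approach, the adjacency list model, the following sketch (in Python with sqlite3; all table and column names are invented for the example) stores each node's parent id and reconstructs the tree at query time with a recursive query:

```python
import sqlite3

# Adjacency list model: every row records its parent's id; a recursive CTE
# walks the tree at query time.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE category (id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO category VALUES (?, ?, ?)",
    [(1, None, "Electronics"), (2, 1, "Phones"), (3, 2, "Smartphones"), (4, 1, "Laptops")],
)

# Fetch the whole tree rooted at the parentless node, with each row's depth.
rows = conn.execute("""
    WITH RECURSIVE subtree(id, name, depth) AS (
        SELECT id, name, 0 FROM category WHERE parent_id IS NULL
        UNION ALL
        SELECT c.id, c.name, s.depth + 1
        FROM category c JOIN subtree s ON c.parent_id = s.id
    )
    SELECT name, depth FROM subtree ORDER BY id
""").fetchall()
print(rows)  # [('Electronics', 0), ('Phones', 1), ('Smartphones', 2), ('Laptops', 1)]
```

The trade-off, which motivates the nested set and materialized path alternatives, is that reading a whole subtree requires either recursive queries or one query per level.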



Binding Types

Doctrine DBAL extends PDO's handling of binding types in prepared statements considerably. Besides the well-known PDO::PARAM_* constants, you can make use of two very powerful additional features.

If you don’t pass an integer (one of the PDO::PARAM_* constants) to a parameter binding method but a string instead, Doctrine DBAL will ask the type abstraction layer to convert the passed value from its PHP representation to a database representation. This way you can, for example, pass DateTime instances to a prepared statement and have Doctrine convert them to the appropriate vendor's database format:

If you take a look at the relevant type class you will see that parts of the conversion are delegated to a method on the current database platform, which means this code works independently of the database you are using.

Be aware that this type conversion only works with the value binding and query execution methods that accept a type. It is not supported to pass a Doctrine type name to bindParam(), because this would not work with binding by reference.

Further Reading

For people new to object-relational mapping and (object-oriented) domain models we recommend the following literature:

The books by Martin Fowler cover a lot of the basic ORM terminology, the different approaches to modeling business logic and the patterns involved.

Another good read is Eric Evans' Domain-Driven Design. Though serious Domain-Driven Design is currently not possible with Doctrine, it is an excellent resource for good domain modeling, especially in complex business domains, and it explains in depth the terminology around domain models that is pretty widespread nowadays (Entities, Value Objects, Repositories, etc.).

Removing entities

An entity can be removed from persistent storage by passing it to the EntityManager#remove($entity) method. By applying the remove operation to an entity, that entity becomes REMOVED, which means that its persistent state will be deleted once EntityManager#flush() is invoked.

Just like persist(), invoking remove() on an entity does NOT cause an immediate SQL DELETE to be issued against the database. The entity will be deleted on the next invocation of flush() that involves that entity. This means that entities scheduled for removal can still be queried for and will appear in query and collection results.


The semantics of the remove operation, applied to an entity X are as follows:

  • If X is a new entity, it is ignored by the remove operation. However, the remove operation is cascaded to entities referenced by X, if the relationship from X to these other entities is mapped with cascade=REMOVE or cascade=ALL.
  • If X is a managed entity, the remove operation causes it to become removed. The remove operation is cascaded to entities referenced by X, if the relationships from X to these other entities are mapped with cascade=REMOVE or cascade=ALL.
  • If X is a detached entity, an InvalidArgumentException will be thrown.
  • If X is a removed entity, it is ignored by the remove operation.
  • A removed entity X will be removed from the database as a result of the flush operation.

After an entity has been removed its in-memory state is the same as before the removal, except for generated identifiers.

Removing an entity will also automatically delete any existing records in many-to-many join tables that link this entity. The action taken depends on the value of the mapping attribute onDelete: either Doctrine issues a dedicated DELETE statement for the records of each join table, or it relies on the foreign key semantics of onDelete=CASCADE.

Deleting an object with all its associated objects can be achieved in multiple ways with very different performance impacts.

  1. If an association is marked as cascade=REMOVE, Doctrine 2 will fetch this association. If it is a single-valued association, Doctrine will pass the associated entity to remove(). If the association is a collection, Doctrine will loop over all its elements and pass them to remove(). In both cases the cascade remove semantics are applied recursively. For large object graphs this removal strategy can be very costly.
  2. Using a DQL DELETE statement allows you to delete multiple entities of a type with a single command and without hydrating these entities. This can be very efficient for deleting large object graphs from the database.
  3. Using foreign key semantics with onDelete=CASCADE can force the database to remove all associated objects internally. This strategy is a bit tricky to get right but can be very powerful and fast. Be aware, however, that strategy 1 (cascade=REMOVE) completely bypasses any foreign key onDelete option, because Doctrine will fetch and remove all associated entities explicitly anyway.
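The database-side behavior behind strategy 3 can be demonstrated outside Doctrine with plain SQL foreign key semantics. This sketch (Python with sqlite3; table names are invented for the example) deletes one parent row and lets the database cascade to the children:

```python
import sqlite3

# Strategy 3 in miniature: the database cascades the delete through the
# foreign key, so no child object is ever fetched or hydrated.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.execute("CREATE TABLE user (id INTEGER PRIMARY KEY)")
conn.execute("""
    CREATE TABLE phonenumber (
        id INTEGER PRIMARY KEY,
        user_id INTEGER REFERENCES user(id) ON DELETE CASCADE
    )
""")
conn.execute("INSERT INTO user VALUES (1)")
conn.executemany("INSERT INTO phonenumber VALUES (?, 1)", [(1,), (2,), (3,)])

conn.execute("DELETE FROM user WHERE id = 1")  # one statement removes all four rows
remaining = conn.execute("SELECT COUNT(*) FROM phonenumber").fetchone()[0]
print(remaining)  # 0
```

Note the PRAGMA line: on SQLite, forgetting to enable foreign keys per connection is a classic way for this strategy to silently do nothing.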

Writing Migration Classes

Migration classes consist of a simple class that extends from Doctrine_Migration. You can define up() and down() methods that are meant for doing and undoing changes to the database for that migration version.

The name of the class can be whatever you want, but the name of the file which contains the class must have a prefix containing a number that is used for loading the migrations in the correct order.
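The ordering rule above can be sketched with a hypothetical loader (in Python for illustration; Doctrine's own loader differs in detail). The point is that the numeric prefix must be compared as a number, since a plain lexicographic sort would put `10_` before `2_`:

```python
import re

# Hypothetical migration file names; only the numeric prefix decides order.
files = ["2_add_column.php", "10_change_column.php", "1_add_table.php"]

def migration_order(filename: str) -> int:
    # Extract the leading number. Sorting the raw strings instead would
    # wrongly order "10_..." before "2_...".
    match = re.match(r"(\d+)_", filename)
    if match is None:
        raise ValueError(f"no numeric prefix: {filename}")
    return int(match.group(1))

ordered = sorted(files, key=migration_order)
print(ordered)  # ['1_add_table.php', '2_add_column.php', '10_change_column.php']
```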

Below are a few examples of some migration classes you can use to build your database starting from version 1.

For the first version, let's create a new table:

Now let's create a second version where we add a new column to the table we added in the previous version:

Finally, let's change the type of a column in the table we created previously:

Now that we have created the three migration classes above, we can run the migration script we implemented earlier:

If you look in the database, you will see that the new table has been created and that the stored migration version number is set to three.

If you want to migrate back to where we started, you can pass a version number to the migrate() method in the script:

Now run the script:

If you look in the database now, everything we did in the up() methods has been reversed by the contents of the down() methods.

Here is a list of the available methods you can use to alter your database in your migration classes.

You might already notice that the data structures used to manipulate your schema are the same as the data structures used with the database abstraction layer. This is because internally the migration package uses the database abstraction layer to perform the operations specified in the migration classes.

Now the opposite would look like the following:

The valid options are:

Name Description
name Optional constraint name
local The local field(s)
foreign The foreign reference field(s)
foreignTable The name of the foreign table
onDelete Referential delete action
onUpdate Referential update action
deferred Deferred constraint checking

Some DBMSs, such as SQLite, do not implement the rename column operation. An exception is thrown if you try to rename a column when using a SQLite connection.

Change any aspect of an existing column:

Sometimes you may perform operations in the up() method that cannot be reversed, for example removing a column from a table. In this case you need to throw a Doctrine_Migration_IrreversibleMigrationException.

Sometimes you may need to alter the data in the database with your models. Since you may have just created a table or made a change, you have to do the data altering after the up() or down() method is processed. We have hooks in place for this, named preUp(), postUp(), preDown(), and postDown(). Define these methods and they will be triggered:

The above example assumes you have created and made available the corresponding model. Once the up() method is executed the table is created, so the model can be used. We have provided the definition of this model below.

Here is the same example in YAML format. You can read more about YAML in the YAML Schema Files chapter:

With Doctrine migrations it is possible, most of the time, to automate the opposite of a migration method. For example, if you create a new column in the up of a migration, we should be able to easily automate the down, since all we need to do is remove the column that was created. This is possible by using a single migrate() method for both the up and the down.

The migrate() method accepts a $direction argument, which will have a value of either up or down. This value is passed as the first argument to functions like table(), column(), etc.

Here is an example where we automate the adding and removing of a column:

Now when we run up with the above migration the column will be added, and when we run down the column will be removed.

Here is a list of the migration methods that can be automated:

Automate Method Name Automates
table() createTable()/dropTable()
constraint() createConstraint()/dropConstraint()
foreignKey() createForeignKey()/dropForeignKey()
column() addColumn()/removeColumn()
index() addIndex()/removeIndex()

Doctrine offers the ability to generate migration classes a few different ways. You can generate a set of migrations to re-create an existing database, or generate migration classes to create a database for an existing set of models. You can even generate migrations from differences between two sets of schema information.

Generating a set of migrations from an existing database connection is simple: just use Doctrine_Core::generateMigrationsFromDb().

Generating a set of migrations from an existing set of models is just as simple: just use Doctrine_Core::generateMigrationsFromModels().


Doctrine offers the ability to create and drop your databases from your defined Doctrine connections. The only trick to using it is that the name of your Doctrine connection must be the name of your database. This is required because PDO does not offer a method for retrieving the name of the database you are connected to, so in order to create and drop a database Doctrine itself must be aware of its name.

Doctrine offers static convenience methods available in the main Doctrine class. These methods perform some of the most used functionality of Doctrine with one method call. Most of these methods are used in the Doctrine_Task system, and these tasks are also what is executed from the Doctrine_Cli.

Index Building

Whenever a searchable record is inserted into the database, Doctrine executes the index building procedure. This happens in the background, as the procedure is invoked by the search listener. The phases of this procedure are:

  1. Analyze the text using a Doctrine_Search_Analyzer based class
  2. Insert new rows into the index table for all analyzed keywords

Sometimes you may not want to update the index table directly when new searchable entries are added; rather, you may want to batch update the index table at certain intervals. To disable the direct update functionality, you need to set the batchUpdates option to true when you attach the behavior:

Here is the same example in YAML format. You can read more about YAML in the YAML Schema Files chapter:

The actual batch updating procedure can be invoked with the batchUpdateIndex() method. It takes two optional arguments: limit and offset. The limit can be used for limiting the number of batch indexed entries, while the offset can be used for setting the first entry to start the indexing from.
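The limit/offset mechanics can be sketched as follows (in Python for illustration; the function and variable names are hypothetical, not the actual Doctrine API):

```python
# A toy inverted index: keyword -> set of record ids.
index = {}

def analyze(text):
    # Stand-in for the real analyzer: lowercase and split on whitespace.
    return text.lower().split()

def batch_update_index(pending, limit=None, offset=0):
    # Take a window of un-indexed entries, bounded by limit and offset.
    end = None if limit is None else offset + limit
    batch = pending[offset:end]
    for row_id, text in batch:
        for keyword in analyze(text):
            index.setdefault(keyword, set()).add(row_id)
    return len(batch)  # number of entries indexed in this run

pending = [(1, "Doctrine search"), (2, "batch updates"), (3, "index table")]
n1 = batch_update_index(pending, limit=2)            # indexes rows 1 and 2
n2 = batch_update_index(pending, limit=2, offset=2)  # indexes row 3
print(n1, n2, sorted(index["index"]))  # 2 1 [3]
```

Running the procedure repeatedly with an increasing offset lets a cron job work through a large backlog in bounded chunks.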

First let's insert some new records:

If you don’t have batch updates enabled then the index will be automatically updated for you when you insert or update records. If you do have batch updates enabled then you can perform the batch updates by using the following code:

Transaction Demarcation

Transaction demarcation is the task of defining your transaction boundaries. Proper transaction demarcation is very important because if not done properly it can negatively affect the performance of your application. Many databases and database abstraction layers like PDO by default operate in auto-commit mode, which means that every single SQL statement is wrapped in a small transaction. Without any explicit transaction demarcation from your side, this quickly results in poor performance because transactions are not cheap.

For the most part, Doctrine 2 already takes care of proper transaction demarcation for you: all the write operations (INSERT/UPDATE/DELETE) are queued until EntityManager#flush() is invoked, which wraps all of these changes in a single transaction.

However, Doctrine 2 also allows (and encourages) you to take over and control transaction demarcation yourself.

There are two ways to deal with transactions when using the Doctrine ORM; they are described in more detail below.

The first approach is to use the implicit transaction handling provided by the Doctrine ORM EntityManager. Given the following code snippet, without any explicit transaction demarcation:

Since we do not do any custom transaction demarcation in the above code, EntityManager#flush() will begin and commit/rollback a transaction. This behavior is made possible by the aggregation of the DML operations by the Doctrine ORM, and is sufficient if all the data manipulation that is part of a unit of work happens through the domain model and thus the ORM.

The explicit alternative is to use the Connection API directly to control the transaction boundaries. The code then looks like this:

Explicit transaction demarcation is required when you want to include custom DBAL operations in a unit of work, or when you want to make use of some methods of the EntityManager API that require an active transaction. Such methods will throw a TransactionRequiredException to inform you of that requirement.
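The shape of explicit demarcation, begin, try the work, commit on success, roll back on failure, can be illustrated outside Doctrine with plain Python and sqlite3 (illustrative only; the table and function names are invented):

```python
import sqlite3

# Explicit transaction demarcation: both UPDATEs commit together or not at all.
conn = sqlite3.connect(":memory:")
conn.isolation_level = None  # take manual control of transactions
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 100), (2, 0)")

def transfer(amount: int) -> None:
    conn.execute("BEGIN")
    try:
        conn.execute("UPDATE account SET balance = balance - ? WHERE id = 1", (amount,))
        if amount > 100:
            raise ValueError("insufficient funds")
        conn.execute("UPDATE account SET balance = balance + ? WHERE id = 2", (amount,))
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")  # the debit above does not survive
        raise

transfer(40)          # succeeds: 100/0 -> 60/40
try:
    transfer(500)     # fails mid-transaction and is rolled back
except ValueError:
    pass
balances = [row[1] for row in conn.execute("SELECT id, balance FROM account ORDER BY id")]
print(balances)  # [60, 40]
```

The try/except/rollback boilerplate here is exactly what the transactional() control abstractions described next exist to remove.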

A more convenient alternative for explicit transaction demarcation is the use of the provided control abstractions in the form of Connection#transactional($func) and EntityManager#transactional($func). When used, these control abstractions ensure that you never forget to roll back the transaction, in addition to the obvious code reduction. An example that is functionally equivalent to the previously shown code looks as follows:

For historical reasons, EntityManager#transactional($func) will return true whenever the return value of $func is loosely false. Some examples of this include array(), "0", "", 0, and null.

The difference between Connection#transactional() and EntityManager#transactional() is that the latter abstraction flushes the EntityManager prior to transaction commit and rolls back the transaction when an exception occurs.

Defining Identity and Generator Strategies

An entity has to have at least one id element. For composite keys you can specify more than one id element; however, surrogate keys are recommended for use with Doctrine 2. The id element allows you to define properties of the identifier and accepts a subset of the field element's attributes:

Required attributes:

  • name — The name of the Property/Field on the given Entity PHP class.
  • type — The type name, preferably «string» or «integer».

Optional attributes:

column — Name of the column in the database, defaults to the field name.

Using the simplified definition above, Doctrine will use no identifier strategy for this entity. That means you have to manually set the identifier before calling EntityManager#persist(). This is the so-called NONE strategy.

If you want to switch the identifier generation strategy, you have to nest a generator element inside the id element. This of course only works for surrogate keys; for composite keys you always have to assign the identifiers manually.

The following values are allowed for the strategy attribute:

  • AUTO — Automatic detection of the identifier strategy based on the preferred solution of the database vendor.
  • IDENTITY — Use of an identity strategy such as auto-increment IDs; the generated identifier is available to Doctrine AFTER the INSERT statement has been executed.
  • SEQUENCE — Use of a database sequence to retrieve the entity-ids. This is possible before the INSERT statement is executed.

If you are using the SEQUENCE strategy, you can define an additional sequence-generator element to describe the sequence:

Required attributes for sequence-generator:

sequence-name — The name of the sequence

Optional attributes for sequence-generator:

Functional Expressions

The CONCAT function returns a string that is a concatenation of its arguments. In the example above we map the concatenation of two of the user's fields to a single named value.

The above call to getSqlQuery() would output the following SQL query:

Now we can execute the query and get the mapped function value:

The second and third arguments of the SUBSTRING function denote the starting position and length of the substring to be returned. These arguments are integers. The first position of a string is denoted by 1. The SUBSTRING function returns a string.

The above call to getSqlQuery() would output the following SQL query:

Notice how the SQL is generated with the proper syntax for the DBMS you are using!

The TRIM function trims the specified character from a string. If the character to be trimmed is not specified, it is assumed to be space (or blank). The optional trim_character is a single-character string literal or a character-valued input parameter (i.e., char or Character). If a trim specification is not provided, BOTH is assumed. The TRIM function returns the trimmed string.

The above call to getSqlQuery() would output the following SQL query:

The LOWER and UPPER functions convert a string to lower and upper case, respectively. They return a string.

The above call to getSqlQuery() would output the following SQL query:

The LOCATE function returns the position of a given string within a string, starting the search at a specified position. It returns the first position at which the string was found, as an integer. The first argument is the string to be located; the second argument is the string to be searched; the optional third argument is an integer that represents the string position at which the search is started (by default, the beginning of the string to be searched). The first position in a string is denoted by 1. If the string is not found, 0 is returned.

The LENGTH function returns the length of the string in characters as an integer.

Dealing with Relations

Accessing related records in Doctrine is easy: you can use exactly the same getters and setters as for the record properties.

You can use any of the three ways above, however the last one is the recommended one for array portability purposes.

When accessing a one-to-one related record that doesn’t exist, Doctrine automatically creates the object. That is why the above code is possible.

When accessing one-to-many related records, Doctrine creates a Doctrine_Collection for the related component. Let's say we have User and Phonenumber classes and their relation is one-to-many. You can add phonenumbers easily, as shown above:

Now we can easily save the user and the associated phonenumbers:

Another way to easily create a link between two related components is by using Doctrine_Record::link. It often happens that you have two existing records that you would like to relate (or link) to one another. In this case, if there is a relation defined between the involved record classes, you only need the identifiers of the related record(s):

Let's create a few new Phonenumber objects and keep track of the new phone number identifiers:

Let’s link the phone numbers to the user, since the relation to Phonenumber exists for the User record:

If the inverse relation is defined for the related record class as well, you may even do this:

First create a user to work with:

Now create a new Phonenumber instance:

Now we can link the Phonenumber to our User:

We can create another phone number:

Let’s link this phone number to our user too:

You can retrieve related records with the very same Doctrine_Record methods as in the previous subchapter. Please note that whenever you access a related component that isn't already loaded, Doctrine uses one SQL statement for the fetching; hence the following example executes three SQL queries.


A much more efficient way of doing this is to use DQL. The following example uses only one SQL query for the retrieval of the related components.

You can update related records by calling save for each related object / collection individually, or by calling save on the object that owns the other objects. You can also call Doctrine_Connection::flush(), which saves all pending objects.

You can clear a related record's references from an object. This does not change the fact that these objects are related, and it won't change the database if you save; it simply clears the reference in PHP from one object to another.

You can clear all references by doing the following:

Or you can clear a specific relationship:

This is useful if you were to do something like the following:

We can simplify the above scenario even further by using the relatedExists method, so that you can do the above check with less code and without having to worry about clearing the unnecessary reference afterwards.

You can delete related records individually by calling delete on a record or on a collection.

Here you can delete an individual related record:

You can delete an individual record from within a collection of records:

You could delete the entire collection if you wanted:

Or you can just delete the entire user and all related objects:

Usually in a typical web application the primary keys of the related objects to be deleted come from a form. In this case the most efficient way of deleting the related records is to use a DQL DELETE statement. Let's say we once again have User and Phonenumber with their relation being one-to-many. Deleting the given phonenumbers for a given user id can be achieved as follows:

Sometimes you may not want to delete the records but simply unlink the relations by setting the foreign key fields to null. This can of course be achieved with DQL, but perhaps the most elegant way of doing this is by using Doctrine_Record::unlink.

Please note that the unlink method is very smart. It not only sets the foreign key fields of the related records to null, but it also removes all the given references from the object.

Let's say we have a User who has three Phonenumbers (with identifiers 1, 2 and 3). Unlinking phonenumbers 1 and 3 can be achieved as easily as:


A database transaction is a unit of interaction with a database management system, treated in a coherent and reliable way independent of other transactions, that must be either entirely completed or aborted. Ideally, a database system will guarantee all of the ACID (Atomicity, Consistency, Isolation, and Durability) properties for each transaction.

  • Atomicity refers to the ability of the DBMS to guarantee that either all of the tasks of a transaction are performed or none of them are. The transfer of funds can be completed or it can fail for a multitude of reasons, but atomicity guarantees that one account won’t be debited if the other is not credited as well.
  • Consistency refers to the database being in a legal state when the transaction begins and when it ends. This means that a transaction can't break the rules, or integrity constraints, of the database. If an integrity constraint states that all accounts must have a positive balance, then any transaction violating this rule will be aborted.
  • Isolation refers to the ability of the application to make operations in a transaction appear isolated from all other operations. This means that no operation outside the transaction can ever see the data in an intermediate state; a bank manager can see the transferred funds on one account or the other, but never on both, even if she ran her query while the transfer was still being processed. More formally, isolation means the transaction history (or schedule) is serializable. For performance reasons, this ability is the most often relaxed constraint. See the isolation article for more details.

  • Durability refers to the guarantee that once the user has been notified of success, the transaction will persist, and not be undone. This means it will survive system failure, and that the database system has checked the integrity constraints and won’t need to abort the transaction. Typically, all transactions are written into a log that can be played back to recreate the system to its state right before the failure. A transaction can only be deemed committed after it is safely in the log.

In Doctrine, all operations are wrapped in transactions by default. There are some things that should be noted about how Doctrine works internally:

  • Doctrine uses application level transaction nesting.
  • Doctrine always executes INSERT / UPDATE / DELETE queries at the end of a transaction (when the outermost commit is called). The operations are performed in the following order: first all inserts, then all updates and finally all deletes. Doctrine knows how to optimize the deletes so that delete operations of the same component are gathered into one query.
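The queue-then-replay idea behind this can be sketched in a few lines (Python for illustration; a deliberately simplified stand-in, not Doctrine's actual unit of work):

```python
from collections import defaultdict

# A toy unit of work: operations are queued as they happen and replayed at
# commit time, grouped so inserts run first, then updates, then deletes.
class UnitOfWork:
    ORDER = ("insert", "update", "delete")

    def __init__(self):
        self.pending = defaultdict(list)

    def queue(self, kind, table, payload):
        self.pending[kind].append((table, payload))

    def commit(self):
        executed = []
        for kind in self.ORDER:  # fixed replay order, regardless of queue order
            for table, payload in self.pending[kind]:
                executed.append(f"{kind.upper()} {table} {payload}")
        self.pending.clear()
        return executed

uow = UnitOfWork()
uow.queue("delete", "user", 3)                        # queued first...
uow.queue("insert", "user", {"name": "jon"})
uow.queue("update", "user", {"id": 1, "name": "jane"})
print(uow.commit())                                   # ...but executed last
```

Grouping by operation kind is also what makes it possible to collapse several deletes against the same table into a single statement.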

First we need to begin a new transaction:

Next perform some operations which result in queries being executed:

Now we can commit all the queries by using the commit() method:


Currently it is possible to install Doctrine in four different ways, which are listed below:

  • SVN (subversion)
  • SVN externals
  • PEAR Installer
  • Download PEAR Package

It is recommended to download Doctrine via SVN (subversion), because in this case updating is easy. If your project is already under version control with SVN, you should choose SVN externals.

If you wish to just try out Doctrine in under 5 minutes, the sandbox package is recommended. We will discuss the sandbox package in the next section.

Doctrine also provides a special sandbox package, which is a zero configuration Doctrine implementation that lets you test Doctrine without writing a single line of code. You can download it from the download page.

The sandbox implementation is not a recommended implementation for a production application. Its only purpose is exploring Doctrine and running small tests.

Key Concepts

The Doctrine Query Language (DQL) is an object query language. It lets you express queries for single objects or full object graphs, using the terminology of your domain model: class names, field names, relations between classes, etc. This is a powerful tool for retrieving or even manipulating objects without breaking the separation of the domain model (field names, class names, etc.) from the relational model (table names, column names, etc.). DQL looks very much like SQL, and this is intended, because it makes it relatively easy to grasp for people who know SQL. There are, however, a few very important differences you should always keep in mind:

Take this example DQL query:

The things to notice about this query:

  • We select from classes and not tables. We are selecting from the User class/model.
  • We join along associations (u.Phonenumbers)
  • We can reference fields (u.level)
  • There is no join condition (ON x.y = y.x). The associations between your classes and how these are expressed in the database are known to Doctrine (You need to make this mapping known to Doctrine, of course. How to do that is explained later in the Defining Models chapter.).

DQL expresses a query in the terms of your domain model (your classes, the attributes they have, the relations they have to other classes, etc.).

It’s very important that we speak about classes, fields and associations between classes here. User is not a table or table name. It may be that the database table the User class is mapped to is indeed named User, but you should nevertheless adhere to this differentiation of terminology. This may sound nitpicky since, due to the ActiveRecord approach, your relational model is often very similar to your domain model, but it’s really important. The column names are rarely the same as the field names, and as soon as inheritance is involved, the relational model starts to diverge from the domain model. You can have a class User that is in fact mapped to several tables in the database. At this point it should be clear that talking about selecting from the User table is simply wrong. And as Doctrine development continues, there will be more features available that allow the two models to diverge even more.

Text Analyzers

By default Doctrine uses Doctrine_Search_Analyzer_Standard for analyzing the text. This class performs the following things:

  • Strips out stop-keywords (such as ‘and’, ‘if’ etc.) As many commonly used words such as ‘and’, ‘if’ etc. have no relevance for the search, they are being stripped out in order to keep the index size reasonable.
  • Lowercases all keywords. When searching, the words ‘database’ and ‘DataBase’ are considered equal by the standard analyzer; hence the standard analyzer lowercases all keywords.
  • Replaces all non alpha-numeric marks with whitespace. In normal text many keywords might contain non alpha-numeric chars after them, for example ‘database.’. The standard analyzer strips these out so that ‘database’ matches ‘database.’.
  • Replaces all quotation marks with empty strings so that «O’Connor» matches «oconnor»
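The four steps above can be sketched compactly (in Python for illustration; the stop word list and function name are invented for the example, and the real Doctrine_Search_Analyzer_Standard differs in detail):

```python
import re

# A tiny illustrative stop list; the real one is much longer.
STOP_WORDS = {"and", "if", "the", "a", "or"}

def analyze(text: str) -> list[str]:
    # 1. Drop quotation marks so "O'Connor" matches "oconnor".
    text = text.replace("'", "").replace("\u2019", "")
    # 2. Replace all non alpha-numeric marks with whitespace ("database." -> "database").
    text = re.sub(r"[^0-9a-zA-Z]+", " ", text)
    # 3. Lowercase every keyword.
    words = text.lower().split()
    # 4. Strip out stop-keywords that carry no relevance for search.
    return [w for w in words if w not in STOP_WORDS]

print(analyze("The database and O'Connor, if searched."))
# ['database', 'oconnor', 'searched']
```

Each step trades a little precision for a smaller, more forgiving index, which is exactly the job of an analyzer.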

You can write your own analyzer class by creating a class that implements Doctrine_Search_Analyzer_Interface. Here is an example where we create a custom analyzer:

A search analyzer must contain only one method, named analyze(), and it should return the modified input text to be indexed.

This analyzer can then be applied to the search object as follows:

Entity Object Graph Traversal

Although Doctrine allows for a complete separation of your domain model (the entity classes), there will never be a situation where objects are missing when traversing associations. You can walk all the associations inside your entity models as deeply as you want.

Take the following example of a single entity fetched from a newly opened EntityManager.

This code only retrieves the entity instance with id 1, executing a single SELECT statement against the articles table in the database. You can still access the associated properties author and comments and the associated objects they contain.

This works by utilizing the lazy loading pattern. Instead of passing you back a real Author instance and a collection of comments, Doctrine will create proxy instances for you. Only when you access these proxies for the first time will they go through the EntityManager and load their state from the database.
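The pattern itself is not Doctrine-specific and can be sketched in a few lines (Python for illustration; the class and function names are invented, and Doctrine's generated proxies work differently in detail):

```python
# Lazy loading in miniature: the proxy defers its database hit until the
# entity's state is actually touched.
class AuthorProxy:
    def __init__(self, author_id, loader):
        self._id = author_id
        self._loader = loader   # callable that performs the real fetch
        self._loaded = None

    def __getattr__(self, name):
        # Called only for attributes not defined on the proxy itself,
        # i.e. on first real access to the entity's state.
        if self._loaded is None:
            self._loaded = self._loader(self._id)
        return getattr(self._loaded, name)

class Author:
    def __init__(self, name):
        self.name = name

calls = []
def load_author(author_id):
    calls.append(author_id)     # stands in for a database round trip
    return Author("Jane Doe")

proxy = AuthorProxy(7, load_author)
print(len(calls))   # 0 -- constructing the proxy loads nothing
print(proxy.name)   # Jane Doe -- first attribute access triggers the load
print(len(calls))   # 1 -- and subsequent accesses reuse the loaded state
```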

This lazy-loading process happens behind the scenes, hidden from your code. See the following code:

A slice of the generated proxy class code looks like the following. A real proxy class overrides ALL public methods along the lines of the method shown below:
