CakePHP 2.1 - How to properly use DISTINCT in find()

alieninlondon · Jun 23, 2012 · Viewed 28.1k times

I have a question that is driving me crazy, and I have to admit I am not that experienced in CakePHP. As mentioned in the question "Using DISTINCT in a CakePHP find function", using DISTINCT this way:

$this->Model->find('all', array('fields'=>'DISTINCT field_name'));

does not return DISTINCT values; instead it returns all rows. In fact, the DISTINCT here is completely pointless because, for some reason, CakePHP adds TableName.id to the SQL query (why? Can I remove the id reference?), effectively returning every DISTINCT primary key (= all rows = unhelpful).

So, I still want to return the DISTINCT values of a particular field_name column. Can I not do it using just the find('all') or find('list') function? Is the Set::extract() function described in the link above really the proper way of doing it? That appears to be an overly indirect solution for CakePHP; normally Cake makes my life easier. :-) What is the proper way of using find and DISTINCT together? Or maybe DISTINCT doesn't work with find() at all?

Looking at the CookBook, they say: "A quick example of doing a DISTINCT query. You can use other operators, such as MIN(), MAX(), etc., in a similar fashion:"

<?php
    array(
        'fields' => array('DISTINCT (User.name) AS my_column_name'),
        'order' => array('User.id DESC')
    )
?>

Source: http://book.cakephp.org/2.0/en/models/retrieving-your-data.html

This indicates that DISTINCT should be possible to use, but what is what here? Does (User.name) correspond to the field_name I want DISTINCT for or is my_column_name my field_name?

Finally, has any of this changed when migrating from CakePHP 1.x to CakePHP 2.x? I.e., are the answers for CakePHP 1.x seen on Stack Overflow still relevant?

Thanks in advance!

Answer

dhofstet · Jun 24, 2012

Yes, the second snippet is the correct way to do a SELECT DISTINCT in CakePHP 2.x. User.name is the field you want distinct values for, in this case the name field of the users table. my_column_name is an (optional) alias for that field in the result set, i.e. instead of name the field will be called my_column_name in the result set.
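
To make the mapping between User.name and my_column_name concrete, here is a minimal sketch of how the options array from the Cookbook snippet could be assembled and handed to find(). The helper distinctFindOptions() is hypothetical (it is not part of CakePHP), and the 'recursive' => -1 setting and the ordering by the selected column are assumptions added for illustration, not something the answer prescribes.

```php
<?php
// Hypothetical helper, not a CakePHP function: builds the find() options
// for a SELECT DISTINCT on one column of one model.
function distinctFindOptions($modelAlias, $field, $resultAlias = null)
{
    // e.g. "User.name": the column whose distinct values are wanted
    $column = $modelAlias . '.' . $field;

    // Same shape as the Cookbook example: DISTINCT (User.name) AS my_column_name
    $expression = 'DISTINCT (' . $column . ')';
    if ($resultAlias !== null) {
        // Optional alias: only renames the column in the result set
        $expression .= ' AS ' . $resultAlias;
    }

    return array(
        'fields' => array($expression),
        // Assumption: order by the selected column, because some databases
        // reject ORDER BY on a column missing from a SELECT DISTINCT list.
        'order' => array($column . ' ASC'),
        // Assumption: -1 keeps associated models out of the query.
        'recursive' => -1,
    );
}

$options = distinctFindOptions('User', 'name', 'my_column_name');

// Inside a CakePHP 2.x controller or model, the array would be used as:
//   $rows = $this->User->find('all', $options);
```

Calling the helper without the third argument yields 'DISTINCT (User.name)', i.e. the same query without the alias, which matches the answer's point that the alias is optional.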