How can I get TStringList to sort differently in Delphi

lkessler picture lkessler · Feb 1, 2010 · Viewed 16.1k times · Source

I have a simple TStringList. I do a TStringList.Sort on it.

Then I notice that the underscore "_" sorts before the capital letter "A". This was in contrast to a third party package that was sorting the same text and sorted _ after A.

According to the ANSI character set, A-Z are characters 65 - 90 and _ is 95. So it looks like the 3rd party package is using that order and TStringList.Sort isn't.

I drilled down into guts of TStringList.Sort and it is sorting using AnsiCompareStr (Case Sensitive) or AnsiCompareText (Case Insensitive). I tried it both ways, setting my StringList's CaseSensitive value to true and then false. But in both cases, the "_" sorts first.

I just can't imagine that this is a bug in TStringList. So there must be something else here that I am not seeing. What might that be?

What I really need to know is how can I get my TStringList to sort so that it is in the same order as the other package.

For reference, I am using Delphi 2009 and I'm using Unicode strings in my program.


So the final answer here is to override the Ansi compares with whatever you want (e.g. non-ansi compares) as follows:

type
  TMyStringList = class(TStringList)
  protected
    function CompareStrings(const S1, S2: string): Integer; override;
  end;

function TMyStringList.CompareStrings(const S1, S2: string): Integer;
begin
  if CaseSensitive then
    Result := CompareStr(S1, S2)
  else
    Result := CompareText(S1, S2);
end;

Answer

Jeroen Wiert Pluimers picture Jeroen Wiert Pluimers · Feb 1, 2010

Define "correctly".
i18n sorting totally depends on your locale.
So I totally agree with PA that this is not a bug: the default Sort behaviour works as designed to allow i18n to work properly.

Like Gerry mentions, TStringList.Sort uses AnsiCompareStr and AnsiCompareText (I'll explain in a few lines how it does that).

But: TStringList is flexible, it contains Sort, CustomSort and CompareStrings, which all are virtual (so you can override them in a descendant class)
Furthermore, when you call CustomSort, you can plug in your own Compare function.

At the of this answer is a Compare function that does what you want:

  • Case Sensitive
  • Not using any locale
  • Just compare the ordinal value of the characters of the strings

CustomSort is defined as this:

procedure TStringList.CustomSort(Compare: TStringListSortCompare);
begin
  if not Sorted and (FCount > 1) then
  begin
    Changing;
    QuickSort(0, FCount - 1, Compare);
    Changed;
  end;
end;

By default, the Sort method has a very simple implementation, passing a default Compare function called StringListCompareStrings:

procedure TStringList.Sort;
begin
  CustomSort(StringListCompareStrings);
end;

So, if you define your own TStringListSortCompare compatible Compare method, then you can define your own sorting.
TStringListSortCompare is defined as a global function taking the TStringList and two indexes referring the items you want to compare:

type
  TStringListSortCompare = function(List: TStringList; Index1, Index2: Integer): Integer;

You can use the StringListCompareStrings as a guideline for implementing your own:

function StringListCompareStrings(List: TStringList; Index1, Index2: Integer): Integer;
begin
  Result := List.CompareStrings(List.FList^[Index1].FString,
                                List.FList^[Index2].FString);
end;

So, by default TStringList.Sort defers to TList.CompareStrings:

function TStringList.CompareStrings(const S1, S2: string): Integer;
begin
  if CaseSensitive then
    Result := AnsiCompareStr(S1, S2)
  else
    Result := AnsiCompareText(S1, S2);
end;

Which then use the under lying Windows API function CompareString with the default user locale LOCALE_USER_DEFAULT:

function AnsiCompareStr(const S1, S2: string): Integer;
begin
  Result := CompareString(LOCALE_USER_DEFAULT, 0, PChar(S1), Length(S1),
    PChar(S2), Length(S2)) - 2;
end;

function AnsiCompareText(const S1, S2: string): Integer;
begin
  Result := CompareString(LOCALE_USER_DEFAULT, NORM_IGNORECASE, PChar(S1),
    Length(S1), PChar(S2), Length(S2)) - 2;
end;

Finally the Compare function you need. Again the limitations:

  • Case Sensitive
  • Not using any locale
  • Just compare the ordinal value of the characters of the strings

This is the code:

function StringListCompareStringsByOrdinalCharacterValue(List: TStringList; Index1, Index2: Integer): Integer;
var
  First: string;
  Second: string;
begin
  First := List[Index1];
  Second := List[Index2];
  if List.CaseSensitive then
    Result := CompareStr(First, Second)
  else
    Result := CompareText(First, Second);
end;

Delphi ain't closed, quite the opposite: often it is a really flexible architecture.
It is often just a bit of digging to see where you can hook into the that flexibility.

--jeroen