Data annotations for handling empty strings

bitshift picture bitshift · Nov 21, 2019 · Viewed 7.1k times · Source

I use the EF Core Powertools to reverse-engineer a set of tables from my Db into an EF data model.

When creating a record in a table, one or more varchar fields are marked as allowing null.

If I update my entity attributes on these fields to use

[Required(AllowEmptyStrings = true)]

This will let empty strings go through, but on a query for a list of entities already in the table that include null values for one or more of these fields, won't that fail because of the attribute? E.g. it's not just empty, it's null.

So how can one modify the entity definition to both allow empty strings to be saved, yet also allow null values in a query result?

Answer

Gert Arnold picture Gert Arnold · Nov 24, 2019

tl;dr
The attribute doesn't help you in data validation. It even causes you trouble because it restricts your querying options.

First, contrary to EF6, EF core doesn't do any data validation. So your good intention to block null strings from entering a database field that is nullable can't be fulfilled by any attribute.

Worse, the attribute will cause you trouble. SQL query generation is affected by properties being required or not. Let me demonstrate that using this little class:

public class Product
{   
    public int ID { get; set; }

    [Required(AllowEmptyStrings = true)] // or false
    public string Name { get; set; }
}

This query...

string name = "a";
var products = db.Products
    .Where(x => x.Name == name)
    .ToList();

...with [Required] (AllowEmptyStrings false or true) generates this WHERE clause:

WHERE ([p].[Name] = @__name_0) AND @__name_0 IS NOT NULL

Witout the attribute nothing less than:

WHERE (([p].[Name] = @__name_0) AND ([p].[Name] IS NOT NULL AND @__name_0 IS NOT NULL))
    OR ([p].[Name] IS NULL AND @__name_0 IS NULL)

EF does this to get the same null semantics as in C#. In SQL, just [p].[Name] = @__name_0 is undetermined when @__name_0 is NULL and the query wouldn't return any records, not even the ones where Name is null. The same LINQ query in C# would. The extra null checks in the second WHERE clause account for the case where both [Name] and @__name_0 are null, which in C# would be considered equal. By the way, if desired, these null semantics can be replaced by database null semantics.

When the field is required, EF assumes the field value is never null and it omits this extra condition. That assumption is what causes you trouble. The query will never return records with null names not even when explicitly querying them. With string name = null; no records with null names will be returned.

But now for the total bummer, EF won't even allow you to query for null values explicitly by an additional condition:

var products = db.Products
    .Where(x => x.Name == name || x.Name == null)
    .ToList();

EF just ignores the name == null condition. It's so sure that the field can't contain null values that this query...

var products = db.Products
    .Where(x => x.Name == null)
    .ToList();

produces this SQL predicate:

WHERE CAST(0 AS bit) = CAST(1 AS bit)

Excuse me? Could I query for null values please? There you are, saddled with a legacy database, trying to do the best you can by making fields required from now on, but EF makes it practically impossible. This is also true in EF6.

Long story short: don't use the attribute. It only does harm and nothing good.

Alternatives

  • Do these validations yourself. This blog has a coule of suggestions.
  • If using ASP.Net, use DTO/view model objects with attributes to get early data entry validation. This can't replace all data validation on save though.
  • (Preferred) Repair the legacy data by converting null strings into something useful and do use the attribute because of its (now beneficial) effect on queries and its effect on model validation in other frameworks, like ASP.Net.