Josh

Reputation: 5026

Core Data equivalent of SQL DATEPART

Given a table of books published, with a date_published column of type NSAttributeType.DateAttributeType, I would like to know how many books were published each year, like this:

Year | Books
-----+------
2013 | 76
2014 | 172
2015 | 155

In plain old SQL this is simple (although it varies slightly by RDBMS):

SELECT DATEPART(yyyy, date_published) AS "Year", COUNT(*) AS "Books"
FROM books
GROUP BY DATEPART(yyyy, date_published)

I'm new to Swift and iOS in general, but everything I looked at suggested either pre-computing the year and storing it, or loading all the data and counting it myself. Neither approach suits me, because the year is in fact an accounting year (whose definition can change after the data is stored) and the amount of data is potentially large.

Most approaches revolved around adding a custom attribute to my NSManagedObject. That seems too late to me, because the object would not have been loaded into memory at this stage. There were also discussions around NSFetchedResultsController and its sectionNameKeyPath, but again this feels too late in the fetch process. I found NSExpression convoluted, so I may well have missed something, but it seems I can't invoke a custom Swift function there. At the end of the day, I expected to find built-in functions for things like DATEPART, DATEADD and DATEDIFF, and I was hoping somebody could point me in the right direction.

As a more concrete example consider the UK tax year which runs 6 April to 5 April. To calculate the tax year I would subtract 3 months and 6 days (to midnight on 5 April). So for a book published on 1 March 2012 I would do the subtraction which would give me 24 November 2011, including 29 days in February for the leap year. From this I simply extract the year part, 2011. So the UK tax year for 1 March 2012 is 2011. I could precompute 2011 and store that in a new column. But then if I move from the UK to Australia the fiscal year changes to July through June. More likely I have a company with a different accounting period than the fiscal year (very likely in the UK). That company then gets taken over by a US group that uses the calendar year, and everyone is happy, except my little app that thinks March 2012 is in 2011.
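
To make the moving target concrete, here is a minimal sketch of the calculation in plain Foundation. It does not use the subtraction described above; instead it compares the month and day against a configurable start of year, which gives the same answer for the examples here. The accountingYear function and its parameters are my own hypothetical names, and the dates are interpreted in UTC purely as an assumption for the example.

```swift
import Foundation

// Hypothetical helper: the accounting year a date falls in, labeled by the
// calendar year in which that accounting year starts.
func accountingYear(of date: Date, startMonth: Int, startDay: Int, calendar: Calendar) -> Int {
    let parts = calendar.dateComponents([.year, .month, .day], from: date)
    guard let year = parts.year, let month = parts.month, let day = parts.day else {
        preconditionFailure("calendar did not return year, month and day")
    }
    // Tuples compare lexicographically: month first, then day.
    let beforeStart = (month, day) < (startMonth, startDay)
    return beforeStart ? year - 1 : year
}

// Assumption: interpret dates in UTC so the example does not depend on the device.
var utc = Calendar(identifier: .gregorian)
utc.timeZone = TimeZone(secondsFromGMT: 0)!

let published = utc.date(from: DateComponents(year: 2012, month: 3, day: 1))!

let ukTaxYear = accountingYear(of: published, startMonth: 4, startDay: 6, calendar: utc)
let australianYear = accountingYear(of: published, startMonth: 7, startDay: 1, calendar: utc)
let calendarYear = accountingYear(of: published, startMonth: 1, startDay: 1, calendar: utc)

print(ukTaxYear, australianYear, calendarYear)   // prints "2011 2011 2012"
```

The same stored date lands in a different year depending on the rule applied, which is why I would rather not store a precomputed year.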

Here's some boilerplate to get going... with no attempt to group by year:

// The raw date for grouping by - no attempt to extract year
let date = NSExpressionDescription()
date.expression = NSExpression(format: "date_published")
date.name = "date"
date.expressionResultType = .DateAttributeType

// The number of books
let books = NSExpressionDescription()
books.expression = NSExpression(format: "count:(publication_title)")
books.name = "books"
books.expressionResultType = .Integer32AttributeType

// Put a fetch together
let fetch = NSFetchRequest(entityName:"Book")
fetch.resultType = .DictionaryResultType
fetch.propertiesToFetch = [date, books]
fetch.propertiesToGroupBy = [date]

// Execute now
var error: NSError?
if let results = context.executeFetchRequest(fetch,
    error: &error) as Array<NSDictionary>? {

        for row in results {
            let date = row.valueForKey("date") as? NSDate
            let books = row.valueForKey("books") as? Int
            NSLog("%@ %d", date!, books!)
        }

} else {

    NSLog("Fail!")

}

Thanks for any pointers!

Upvotes: 4

Views: 515

Answers (1)

Tom Harrington

Reputation: 70966

As you're finding, this touches on a weak spot in Core Data's API. It's common to explain that one shouldn't think of Core Data in terms of SQL because it uses a different approach. Dates are where this can be really annoying, because Core Data hides some SQLite capabilities. (It does this at least partly because Core Data isn't a SQLite wrapper, and can work with other, non-SQL storage systems.)

The core problem is that Core Data's "Date" type corresponds to an NSDate, and NSDate in turn is just a floating-point number representing the number of seconds since a reference date. It doesn't include year, month, or day. Those values are not even fixed, because the instant in time represented by an NSDate might mean a different date in California as opposed to Japan, for example. The word "date" in these type names is unfortunately misleading.
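
As a small illustration of that point (written in Swift to match the question, with a hypothetical helper name), the sketch below takes one instant, the reference date itself, and asks two calendars which year it falls in. The two time zone identifiers are just stand-ins for "California" and "Japan".

```swift
import Foundation

// The reference date itself: 0 seconds after 2001-01-01 00:00:00 UTC.
let instant = Date(timeIntervalSinceReferenceDate: 0)

// Hypothetical helper: the calendar year of an instant as seen in a given time zone.
func year(of date: Date, inZone identifier: String) -> Int {
    var calendar = Calendar(identifier: .gregorian)
    calendar.timeZone = TimeZone(identifier: identifier)!
    return calendar.component(.year, from: date)
}

let californiaYear = year(of: instant, inZone: "America/Los_Angeles")
let japanYear = year(of: instant, inZone: "Asia/Tokyo")

print(californiaYear, japanYear)   // prints "2000 2001"
```

The stored value is identical in both cases; only the calendar and time zone used to interpret it differ.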

This is why people generally recommend using extra fields, or at least different data types, for apps using Core Data that need to consider the actual date in some time zone as opposed to a precise moment in time regardless of zone. There isn't a good way to construct a Core Data query that operates on a "Date" field and does what you need. Dealing with this comes down to storing the data you actually need instead of something that just approximates it, although calling the type "Date" confuses the choice. You don't want a Core Data "Date" type here.

So let's consider one approach to getting the result you need while making SQLite do as much of the work as possible. Let's suppose you replace your date field with an integerDate field that represents the date as an integer (Core Data "Integer 64") using the format yyyyMMdd. Today would be stored as 20151223. In theory this could be done in one step with some NSExpression wizardry, but Core Data doesn't let you group by expressions, so that's out.
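
For reference, producing that integer from a date might look like the Swift sketch below. The integerDate(from:calendar:) helper is a hypothetical name, and interpreting the date in UTC is an assumption; use whichever calendar and time zone define "the date" for your data.

```swift
import Foundation

// Hypothetical helper: pack a date's year, month and day (as seen by the
// given calendar) into one integer, e.g. 23 December 2015 -> 20151223.
func integerDate(from date: Date, calendar: Calendar) -> Int64 {
    let parts = calendar.dateComponents([.year, .month, .day], from: date)
    return Int64(parts.year! * 10_000 + parts.month! * 100 + parts.day!)
}

// Assumption: dates are interpreted in UTC.
var calendar = Calendar(identifier: .gregorian)
calendar.timeZone = TimeZone(secondsFromGMT: 0)!

let published = calendar.date(from: DateComponents(year: 2015, month: 12, day: 23))!
let stored = integerDate(from: published, calendar: calendar)

// Integer division drops the MMdd digits and leaves the year.
let storedYear = stored / 10_000

print(stored, storedYear)   // prints "20151223 2015"
```

Values in this format also sort chronologically, so range predicates on integerDate behave as you would expect.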

Step 1: Get all distinct year values

NSExpression *yearExpression = [NSExpression expressionWithFormat:@"divide:by:(%K,10000)", @"integerDate"];
NSExpressionDescription *yearExpDescription = [[NSExpressionDescription alloc] init];
yearExpDescription.name = @"year";
yearExpDescription.expression = yearExpression;
yearExpDescription.expressionResultType = NSInteger64AttributeType;

NSFetchRequest *distinctYearsRequest = [NSFetchRequest fetchRequestWithEntityName:@"Event"];
distinctYearsRequest.resultType = NSDictionaryResultType;
distinctYearsRequest.returnsDistinctResults = YES;
distinctYearsRequest.propertiesToFetch = @[ yearExpDescription ];

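NSError *fetchError = nil;
NSArray *distinctYearsResult = [self.managedObjectContext executeFetchRequest:distinctYearsRequest error:&fetchError];
if (distinctYearsResult == nil) {
    NSLog(@"Fetching distinct years failed: %@", fetchError);
} else {
    NSLog(@"Distinct years: %@", distinctYearsResult);
}
// Each result is a dictionary with a single "year" key; collect just the values.
// Messaging nil returns nil, so distinctYears is nil if the fetch failed.
NSArray *distinctYears = [distinctYearsResult valueForKey:@"year"];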

In the above, yearExpression gets the year portion of integerDate by simple division. When the above completes, distinctYears contains all the years represented by integerDate.
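
The same division idea can be stretched to an accounting year by subtracting the start of the year, written as MMdd, before dividing. This is my own extension rather than something the fetch above does, and I haven't shown how to express it as an NSExpression; the Swift sketch below only demonstrates the arithmetic, with a hypothetical function name.

```swift
// Hypothetical helper: accounting year for a yyyyMMdd integer, where the
// accounting year starts on startMonth/startDay and is labeled by the
// calendar year in which it starts.
func accountingYear(ofIntegerDate value: Int64, startMonth: Int64, startDay: Int64) -> Int64 {
    let start = startMonth * 100 + startDay   // 6 April -> 406
    // Dates before the start fall below year * 10000 after the subtraction,
    // so integer division lands on the previous year.
    return (value - start) / 10_000
}

let march = accountingYear(ofIntegerDate: 20120301, startMonth: 4, startDay: 6)
let fifthOfApril = accountingYear(ofIntegerDate: 20120405, startMonth: 4, startDay: 6)
let sixthOfApril = accountingYear(ofIntegerDate: 20120406, startMonth: 4, startDay: 6)

print(march, fifthOfApril, sixthOfApril)   // prints "2011 2011 2012"
```

With a start of 1 January (101) this reduces to the plain calendar year.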

Step 2: Loop through years, getting a count for each:

NSMutableDictionary *countByYear = [NSMutableDictionary dictionary];

for (NSNumber *year in distinctYears) {
    NSFetchRequest *countForYearFetch = [NSFetchRequest fetchRequestWithEntityName:@"Event"];
    countForYearFetch.resultType = NSDictionaryResultType;
    countForYearFetch.propertiesToFetch = @[ yearExpDescription ];

    NSExpression *targetYearExpression = [NSExpression expressionForConstantValue:year];
    NSPredicate *yearPredicate = [NSComparisonPredicate predicateWithLeftExpression:yearExpression rightExpression:targetYearExpression modifier:NSDirectPredicateModifier type:NSEqualToPredicateOperatorType options:0];
    countForYearFetch.predicate = yearPredicate;

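    NSError *fetchError = nil;
    NSUInteger countForYear = [self.managedObjectContext countForFetchRequest:countForYearFetch error:&fetchError];
    if (countForYear == NSNotFound) {
        // countForFetchRequest:error: returns NSNotFound when the fetch fails
        NSLog(@"Counting entries for %@ failed: %@", year, fetchError);
        continue;
    }
    countByYear[year] = @(countForYear);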
}

NSLog(@"Results by year: %@", countByYear);

This does a separate fetch for each year, but keeps the memory overhead low by only fetching the count of results instead of the actual data. When this finishes, countByYear has the number of entries by year, based on the integerDate field.

Having said all this, keep in mind that you do have the option of using SQLite directly instead of using Core Data. PLDatabase will give you an Objective-C style wrapper while still allowing raw SQL queries for everything SQLite can do.

Upvotes: 5
