I am looking to extend SilverStripe's CsvBulkLoader class to run some business logic before or during import.
In the WineAdmin class (which extends ModelAdmin), I have a custom loader defined via the $model_importers property:
//WineAdmin.php
private static $model_importers = [
'Wine' => 'WineCsvBulkLoader'
];
In the WineCsvBulkLoader Class, the $columnMap property maps CSV columns to SS DataObject columns:
//WineCsvBulkLoader.php
use SilverStripe\Dev\CsvBulkLoader;
class WineCsvBulkLoader extends CsvBulkLoader
{
public $columnMap = [
// csv columns // SS DO columns
'Item Number' => 'ItemNumber',
'COUNTRY' => 'Country',
'Producer' => 'Producer',
'BrandName' => 'BrandName',
// etc
];
Additionally, the $duplicateChecks property is set to look for duplicates.
public $duplicateChecks = [
'ItemNumber' => 'ItemNumber'
];
}
In the docs, I found some code for an example method that splits data in a column into 2 parts and maps those parts to separate columns on the class:
public static function importFirstAndLastName(&$obj, $val, $record)
{
$parts = explode(' ', $val);
if(count($parts) != 2) return false;
$obj->FirstName = $parts[0];
$obj->LastName = $parts[1];
}
Here are some additional enhancements I hope to make:
I'm not looking for a complete answer, but I'd appreciate any insights.
I'll attempt to answer some of your questions based on SilverStripe 4.2.0:
Judging by the logic in CsvBulkLoader::findExistingObject, the duplicateChecks property is used to find an existing record so that it can be updated rather than created. It uses the values defined in the array to find the first record that matches a given value and returns it.
What does the $duplicateChecks property actually do when there is a duplicate? Does it skip the record?
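It does nothing special and does not skip the record. It just returns the first matching record it finds, which is then updated instead of a new one being created.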
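To make that lookup behaviour concrete, here is a minimal plain-PHP sketch of the idea. It is not the framework's code: records are plain arrays, and findExisting() and importRow() are hypothetical helpers standing in for the ORM query and the import step. The only point is that the first match is returned and updated, and that no match means a new record is created.

```php
<?php
// Hypothetical stand-in for the database table: existing rows as plain arrays.
$existing = [
    ['ID' => 1, 'ItemNumber' => 'A100', 'Producer' => 'Old Producer'],
    ['ID' => 2, 'ItemNumber' => 'B200', 'Producer' => 'Other Producer'],
];

/**
 * Hypothetical helper: return the index of the first existing row that
 * matches any configured duplicate check, or null if none matches.
 */
function findExisting(array $existing, array $duplicateChecks, array $record): ?int
{
    foreach ($duplicateChecks as $field) {
        // In this sketch, an empty CSV value is not used to identify a record.
        if (empty($record[$field])) {
            continue;
        }
        foreach ($existing as $index => $row) {
            if (($row[$field] ?? null) === $record[$field]) {
                return $index; // first match wins
            }
        }
    }
    return null;
}

/** Update the matching row if there is one, otherwise append a new row. */
function importRow(array &$existing, array $duplicateChecks, array $record): string
{
    $index = findExisting($existing, $duplicateChecks, $record);
    if ($index !== null) {
        $existing[$index] = array_merge($existing[$index], $record);
        return 'updated';
    }
    $existing[] = $record;
    return 'created';
}

// Same shape as the $duplicateChecks property in the question.
$checks = ['ItemNumber' => 'ItemNumber'];
$first = importRow($existing, $checks, ['ItemNumber' => 'A100', 'Producer' => 'New Producer']);
$second = importRow($existing, $checks, ['ItemNumber' => 'C300', 'Producer' => 'Third Producer']);
```

In this sketch the row with ItemNumber A100 keeps its ID and gets the new Producer value, while C300 is appended as a new row.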
Can I use callbacks here?
Kind of. You can use a method on the instance of CsvBulkLoader, but you can't pass it a callback directly (e.g. from _config.php etc). Example:
public $duplicateChecks = [
'YourFieldName' => [
'callback' => 'loadRecordByMyFieldName'
]
];
/**
* Take responsibility for loading a record based on "MyFieldName" property
* given the CSV value for "MyFieldName" and the original array record for the row
*
* @return DataObject|false
*/
public function loadRecordByMyFieldName($inputFieldName, array $record)
{
// ....
}
Note: duplicateChecks callbacks are not currently covered by unit tests. There's a todo in CsvBulkLoaderTest to add them.
Is $obj the final import object? How does it get processed?
You can see where these magic-ish methods get called in CsvBulkLoader::processRecord:
if ($mapped && strpos($this->columnMap[$fieldName], '->') === 0) {
$funcName = substr($this->columnMap[$fieldName], 2);
$this->$funcName($obj, $val, $record); // <-------- here: option 1
} elseif ($obj->hasMethod("import{$fieldName}")) {
$obj->{"import{$fieldName}"}($val, $record); // <----- here: option 2
} else {
$obj->update(array($fieldName => $val));
}
This is actually a little misleading, especially because the method's PHPDoc says "Note that columnMap isn't used." Nevertheless, priority is given to a columnMap value of the form ->myMethodName. Both the documentation you linked to and the CustomLoader test implementation in the framework's unit tests use this syntax to target the handler for a specific column:
$loader->columnMap = array(
'FirstName' => '->importFirstName',
// ...
);
In this case, $obj is the DataObject that you're going to update (e.g. a Member).
If you don't do that, you can define importFirstName on the DataObject that's being imported, and the elseif in the code above will then call that function. In that case $obj is not provided, because you can use $this instead.
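The priority order described above can be tried out with a small stand-alone sketch. FakeWine and FakeLoader are hypothetical stand-ins, not SilverStripe classes, and method_exists() replaces the framework's hasMethod(); dispatch() only mirrors the three branches quoted from processRecord.

```php
<?php
// Hypothetical stand-in for the DataObject being imported.
class FakeWine
{
    public $fields = [];

    // Found through the "import{$fieldName}" convention (option 2).
    public function importCountry($val, $record)
    {
        $this->fields['Country'] = strtoupper($val);
    }
}

// Hypothetical stand-in for the loader.
class FakeLoader
{
    public $columnMap = [
        'Producer' => '->importProducer', // option 1: method on the loader
    ];

    public function importProducer($obj, $val, $record)
    {
        $obj->fields['Producer'] = trim($val);
    }

    public function dispatch($obj, $fieldName, $val, array $record)
    {
        $mapped = isset($this->columnMap[$fieldName]);
        if ($mapped && strpos($this->columnMap[$fieldName], '->') === 0) {
            // Option 1: "->method" in columnMap, called on the loader with $obj.
            $funcName = substr($this->columnMap[$fieldName], 2);
            $this->$funcName($obj, $val, $record);
        } elseif (method_exists($obj, "import{$fieldName}")) {
            // Option 2: import{FieldName} on the object itself, no $obj argument.
            $obj->{"import{$fieldName}"}($val, $record);
        } else {
            // Fallback: plain assignment of the value to the field.
            $obj->fields[$fieldName] = $val;
        }
    }
}

$record = ['Producer' => '  Example Producer ', 'Country' => 'au', 'BrandName' => 'Example Brand'];
$wine = new FakeWine();
$loader = new FakeLoader();
foreach ($record as $fieldName => $val) {
    $loader->dispatch($wine, $fieldName, $val, $record);
}
```

Here Producer goes through the loader method, Country through the method on the object, and BrandName falls through to the plain assignment.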
"Is it the final import object" - yes. It gets written after the loop that code is in:
// write record
if (!$preview) {
$obj->write();
}
Your custom functions need to set the data on $obj (or on $this if using the importFieldName style), but they do not need to write it.
$val seems to be the value of the column in the csv being imported. Is that correct?
Yes, after any formatting has been applied.
What is contained in $record?
It's the source row for the record in the CSV after formatting callbacks have been run on it, provided for context.
I hope this helps and that you can achieve what you want to achieve! This part of the framework probably hasn't had a lot of love in recent times, so please feel free to make a pull request to improve it in any way, even if it's only documentation updates! Good luck.