During the insert into a destination table, any error that occurs is redirected to Errors table where we can see the ErrorCode
and ErrorColumn
. The problem is that we got a value in ErrorColumn
which does not exist anywhere within the package. Namely, there is not a single column that has LineageID
that is equal to ErrorColumn
.
Later, while enabling NULL entry in every single column, one by one, I found which column caused the problem. When I analyzed the column inside of a Data Flow task it did not have the LineageID
that was reported in an ErrorColumn
. For example, the ErrorColumn
reported was 413, but LineageID
is 84 during the first merge, and it changes during various sortings. Regardless of that, it never becomes 413. This ErrorColumn
ID (413) does not exist at all yet it is reported during the error redirection insert into destination source (Error table).
I've checked many sites, but they all suggested enumerating through ComponenteMetaData.InputCollection
or ComponentMetaData.OutputCollection
in a script task, and then enumerating through columns in order to find LineageID
, but it was without any success.
As I said, I've solved the problem, but as we are in the early stage of ETL process, this might happen in some other cases. How can this problem be tackled?
I'm copying my answer so we can get an authoritative Q&A on the site
What is the simple way to find the column name from Lineageid in SSIS
I remember saying this can't be that hard, I can write some script in the error redirect to lookup the column name from the input collection.
string badColumn = this.ComponentMetaData.InputCollection[Row.ErrorColumn].Name;
What I learned was the failing column isn't in that collection. Well, it is but the ErrorColumn reported is not quite what I needed. I couldn't find that package but here's an example of why I couldn't get what I needed. Hopefully you will have better luck.
This is a simple data flow that will generate an error once it hits the derived column due to division by zero. The Derived column generates a new output column (LookAtMe) as the result of the division. The data viewer on the Error Output tells me the failing column is 73. Using the above script logic, if I attempted to access column 73 in the input collection, it's going to fail because that is not in the collection. LineageID 73 is LookAtMe and LookAtMe is not in my error branch, it's only in the non-error branch.
This is a copy of my XML and you can see, yes, the outputColumn id 73 is LookAtme.
<outputColumn id="73" name="LookAtMe" description="" lineageId="73" precision="0" scale="0" length="0" dataType="i4" codePage="0" sortKeyPosition="0" comparisonFlags="0" specialFlags="0" errorOrTruncationOperation="Computation" errorRowDisposition="RedirectRow" truncationRowDisposition="RedirectRow" externalMetadataColumnId="0" mappedColumnId="0"><properties>
I really wanted that data though and I'm clever so I can union all my results back together and then conditional split it back out to get that. The problem is, Union All is an asynchronous transformation. Async transformations result in the data being copied from one set of butters to another resulting in...new lineage ids being assigned so even with a union all bringing the two streams back together, you wouldn't be able to call up the data flow chain to find that original lineage id because it's in a different buffer.
Around this point, I conceded defeat and decided I could live without intelligent/helpful error reporting in my packages.
The 2012 release of SSIS changed how they used LineageID to keep columns in sync. Instead of components mapping a number from a source to sink, they went with a textual representation of the element. The XML referenced above would now look like
<outputColumn
refId="Package\DFT Generate Errors\DER Divide by SomeNumber.Outputs[Derived Column Output].Columns[LookAtMe]"
dataType="i4"
errorOrTruncationOperation="Computation"
errorRowDisposition="RedirectRow"
lineageId="Package\DFT Generate Errors\DER Divide by SomeNumber.Outputs[Derived Column Output].Columns[LookAtMe]"
name="LookAtMe"
truncationRowDisposition="FailComponent">
If you look at the ErrorColumn now, they don't even reference the textual lineageid. Instead, they reference column 6. If I search through the source XML, I'm not going to find a reference to Column 6 anywhere. It must be some run-time magic.
The net result unfortunately is the same - you cannot access the error column because it's being created in this component because it only exists in the Output columns collection. It isn't available in the Error Columns collection.