finding individual cases that make up a decision tree node histogram

Discussion:

Jesse

2008-03-21 22:52:46 UTC

I am using the Microsoft Decision Trees algorithm in SQL Server 2005 Analysis Services.

Is it possible to inspect the individual cases that make up a
particular node in a decision tree? Can I write a DMX query to get at
them?

Thanks,
Jesse

Dejan Sarka

2008-03-22 07:48:43 UTC

Post by Jesse
Is it possible to inspect the individual cases that make up a
particular node in a decision tree? Can I write a DMX query to get at
them?

Yes, if you enabled Drillthrough in your model. Check the SELECT FROM
<model>.CASES DMX statement.
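
A minimal sketch of what such a drillthrough query might look like. The model name [MyModel] is hypothetical, the node ID is a placeholder to be read from the model content, and the model is assumed to have been processed with drillthrough enabled:

```
-- Step 1 (run separately): list the nodes to find the one of interest.
SELECT NODE_UNIQUE_NAME, NODE_CAPTION, NODE_SUPPORT
FROM [MyModel].CONTENT

-- Step 2 (run separately): return the training cases that fall in that node.
SELECT *
FROM [MyModel].CASES
WHERE IsInNode('<Node_Unique_Name_for_your_target_node>')
```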

--
Dejan Sarka
http://blogs.solidq.com/EN/dsarka/default.aspx

Jesse

2008-03-22 20:41:28 UTC

On Mar 22, 12:48 am, "Dejan Sarka" wrote:

Post by Dejan Sarka

Post by Jesse
Is it possible to inspect the individual cases that make up a
particular node in a decision tree? Can I write a DMX query to get at
them?

Yes, if you enabled Drillthrough in your model. Check the SELECT FROM
<model>.CASES DMX statement.
--
Dejan Sarka
http://blogs.solidq.com/EN/dsarka/default.aspx

I didn't have drillthrough enabled when the model was built. I guess
this means I have to re-process the model to get at the cases?

Jesse

2008-03-22 20:50:02 UTC

One other thing: can you give me an idea of what the impact of specifying
WITH DRILLTHROUGH on model build time will be? For example, if I have
a model that took 6 hours to process with drillthrough turned off, how
long can I expect it to run with drillthrough enabled? Just an
estimate is all I'm looking for.

Dejan Sarka

2008-03-24 14:59:28 UTC

Post by Jesse
One other thing: can you give me an idea of what the impact of specifying
WITH DRILLTHROUGH on model build time will be? For example, if I have
a model that took 6 hours to process with drillthrough turned off, how
long can I expect it to run with drillthrough enabled? Just an
estimate is all I'm looking for.

Jesse, I think enabling drillthrough is just a flag in the model metadata;
I do not think processing should take longer. However, just to be sure, you
might want to reprocess a subset of the data and compare times with and
without drillthrough.

--
Dejan Sarka
http://blogs.solidq.com/EN/dsarka/default.aspx

Bogdan Crivat [MSFT]

2008-03-24 18:03:20 UTC

Processing with DRILLTHROUGH enabled will typically take just a few seconds
more than without the flag. The algorithm will make an extra pass over the
training data and save the associations between training cases and tree
nodes.
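
For reference, the flag is set when the model is defined, so an existing model has to be redefined and reprocessed to pick it up. A rough sketch of adding a drillthrough-enabled model to an existing structure; the structure, model, and column names here are hypothetical placeholders, not taken from this thread:

```
-- Hypothetical names: [MyStructure], [MyModelWithDrillthrough], and the columns.
-- The column list must refer to columns already defined in the mining structure.
ALTER MINING STRUCTURE [MyStructure]
ADD MINING MODEL [MyModelWithDrillthrough]
(
    [Case Key],
    [Input Column],
    [Target Column] PREDICT
)
USING Microsoft_Decision_Trees
WITH DRILLTHROUGH
```

The new model still needs to be processed before its cases can be queried.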

An alternate way of getting the cases (if Drillthrough is not enabled):

SELECT T.* FROM MyModel NATURAL PREDICTION JOIN
OPENQUERY(...., <Original Training Data>) AS T
WHERE PredictNodeId() = '<Node_Unique_Name_for_your_target_node>'
--
This posting is provided "AS IS" with no warranties, and confers no rights.
Please do not send email directly to this alias. It is for newsgroup
purposes only.

thanks,
bogdan

Jesse

2008-03-29 22:30:53 UTC

On Mar 24, 11:03 am, "Bogdan Crivat [MSFT]" wrote:

Post by Bogdan Crivat [MSFT]
Processing with DRILLTHROUGH enabled will typically take just a few seconds
more than without the flag. The algorithm will make an extra pass over the
training data and save the associations between training cases and tree
nodes.
SELECT T.* FROM MyModel NATURAL PREDICTION JOIN
OPENQUERY(...., <Original Training Data>) AS T
WHERE PredictNodeId() = '<Node_Unique_Name_for_your_target_node>'

This doesn't work: PredictNodeId() requires an argument, which needs to be
a scalar column reference.

Which column reference should I pass? Or perhaps you were thinking of
a different function?
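
PredictNodeId is documented as taking a scalar column reference, and the usual choice is the model's predictable column. Under that assumption, a variant of the query quoted above might look like the sketch below. [Target Column] is a placeholder for the predictable column of the model, and the OPENQUERY arguments are left elided as in the original post:

```
-- [Target Column] is a placeholder for the model's predictable column.
-- The data source and training query are elided, as in the original post.
SELECT T.*
FROM [MyModel]
NATURAL PREDICTION JOIN
OPENQUERY(...., <Original Training Data>) AS T
WHERE PredictNodeId([Target Column]) = '<Node_Unique_Name_for_your_target_node>'
```

Running SELECT PredictNodeId([Target Column]) on its own against a few rows first is one way to check which node IDs come back before filtering on a specific one.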