Discussion:
finding individual cases that make up a decision tree node histogram
Jesse
2008-03-21 22:52:46 UTC
I am using the Microsoft Decision Trees algorithm in SQL Server 2005 Analysis Services.

Is it possible to inspect the individual cases that make up a
particular node in a decision tree? Can I write a DMX query to get at
them?

Thanks,
Jesse
Dejan Sarka
2008-03-22 07:48:43 UTC
Post by Jesse
Is it possible to inspect the individual cases that make up a
particular node in a decision tree? Can I write a DMX query to get at
them?
Yes, if you enabled Drillthrough in your model. Check the SELECT FROM
<model>.CASES DMX statement.
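
A minimal sketch of what that drillthrough query might look like, assuming the model is named [MyModel] (a placeholder) and was processed with drillthrough enabled. The node name is also a placeholder; it would come from the NODE_UNIQUE_NAME column of the model content. The two statements are meant to be run separately.

```dmx
-- List the tree's nodes to find the one of interest.
SELECT NODE_UNIQUE_NAME, NODE_CAPTION, NODE_SUPPORT
FROM [MyModel].CONTENT

-- Return the training cases that fall in that node.
-- '<node unique name>' is a placeholder, not a real node ID.
SELECT *
FROM [MyModel].CASES
WHERE IsInNode('<node unique name>')
```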
--
Dejan Sarka
http://blogs.solidq.com/EN/dsarka/default.aspx
Jesse
2008-03-22 20:41:28 UTC
On Mar 22, 12:48 am, "Dejan Sarka"
Post by Dejan Sarka
Post by Jesse
Is it possible to inspect the individual cases that make up a
particular node in a decision tree? Can I write a DMX query to get at
them?
Yes, if you enabled Drillthrough in your model. Check the SELECT FROM
<model>.CASES DMX statement.
--
Dejan Sarka
http://blogs.solidq.com/EN/dsarka/default.aspx
I didn't have drillthrough enabled when the model was built. I guess
this means I have to re-process the model to get at the cases?
Jesse
2008-03-22 20:50:02 UTC
One other thing: can you give me an idea of the impact that specifying
WITH DRILLTHROUGH will have on model build time? For example, if I have
a model that took 6 hours to process with drillthrough turned off, how
long can I expect it to run with drillthrough enabled? Just an
estimate is all I'm looking for.
Dejan Sarka
2008-03-24 14:59:28 UTC
Post by Jesse
One other thing: can you give me an idea of the impact that specifying
WITH DRILLTHROUGH will have on model build time? For example, if I have
a model that took 6 hours to process with drillthrough turned off, how
long can I expect it to run with drillthrough enabled? Just an
estimate is all I'm looking for.
Jesse, I think enabling drillthrough is just a flag in the model metadata;
I do not think processing should take longer. However, just to be sure, you
might want to reprocess just a subset of the data and compare times with and
without drillthrough.
--
Dejan Sarka
http://blogs.solidq.com/EN/dsarka/default.aspx
Bogdan Crivat [MSFT]
2008-03-24 18:03:20 UTC
Processing with DRILLTHROUGH enabled will typically take just a few seconds
more than without the flag. The algorithm will make an extra pass over the
training data and save the associations between training cases and tree
nodes.

An alternate way of getting the cases (if Drillthrough is not enabled):

SELECT T.* FROM MyModel NATURAL PREDICTION JOIN
OPENQUERY(...., <Original Training Data>) AS T
WHERE PredictNodeId() = '<Node_Unique_Name_for_your_target_node>'
--
This posting is provided "AS IS" with no warranties, and confers no rights.
Please do not send email directly to this alias. It is for newsgroup
purposes only.

thanks,
bogdan
Jesse
2008-03-29 22:30:53 UTC
On Mar 24, 11:03 am, "Bogdan Crivat [MSFT]"
Post by Bogdan Crivat [MSFT]
Processing with DRILLTHROUGH enabled will typically take just a few seconds
more than without the flag. The algorithm will make an extra pass over the
training data and save the associations between training cases and tree
nodes.
SELECT T.* FROM MyModel NATURAL PREDICTION JOIN
OPENQUERY(...., <Original Training Data>) AS T
WHERE PredictNodeId() = '<Node_Unique_Name_for_your_target_node>'
This doesn't work: PredictNodeId() requires an argument, which needs to be
a scalar column reference.

Which column reference should I pass? Or perhaps you were thinking of
a different function?
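
For reference, the DMX documentation gives the signature as PredictNodeId(<scalar column reference>), and the argument is normally the model's predictable column. A sketch of the query above under that assumption follows. [MyModel], [MyTarget], [MyDataSource], and the inner SELECT are all hypothetical placeholders, and the node name would come from NODE_UNIQUE_NAME in the model content.

```dmx
-- Assumes [MyTarget] is the predictable column of [MyModel].
-- The data source name and source query are placeholders.
SELECT T.*
FROM [MyModel]
NATURAL PREDICTION JOIN
OPENQUERY([MyDataSource],
    'SELECT * FROM <original training table>') AS T
WHERE PredictNodeId([MyTarget]) = '<node unique name>'
```

NATURAL PREDICTION JOIN matches input columns to model columns by name, so the source query needs to return columns named as they are in the model.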
