Discussion:
incremental updates
Gustavo Frederico
2007-11-05 15:00:24 UTC
Hi, I'm having problems with incremental updates. I have a mining structure and a mining model. The mining model uses the MS Association Rules algorithm. The mining structure has a data source, which maps to a table containing the (initial) training instances/cases.
The problem happens when I try to programmatically insert into the mining model:

"Error (Data mining): The mining structure , Blah blah blah is already trained and does not support incremental updates. Before using the INSERT INTO statement, use DELETE FROM <object>."

But I don't want to retrain everything, and I don't understand why I would have to delete anything. I'm just looking for a way to train the model incrementally. I would prefer to supply constant data values programmatically for the incremental (mini-)training (with DMX's INSERT INTO) rather than having to insert new cases into the data source table.

I verified the AllowsIncrementalInsert property of the mining service with this code:

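// conn is an open AdomdConnection (Microsoft.AnalysisServices.AdomdClient)
foreach (MiningService ms in conn.MiningServices) {
    Console.WriteLine("");
    Console.WriteLine(ms.DisplayName);
    // true if the algorithm reports that it supports incremental training
    Console.WriteLine(ms.AllowsIncrementalInsert);
}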

and AllowsIncrementalInsert is true for Microsoft Association Rules.

So Association Rules is supposed to support incremental updates. The main question, then, is: how do I perform incremental updates programmatically?

Another question on the subject: does the mining model necessarily need a data source? Why can't I just have a structure and progressively train it providing constant data?

(I'm using the Katmai CTP.)

Thanks,
Gustavo Frederico
Bogdan Crivat [MSFT]
2007-11-06 16:40:12 UTC
Hello, Gustavo

Incremental model training is not supported by Analysis Services at the
moment. The fact that Association Rules (and, I just noticed, a few other
algorithms) report they support incremental training is a bug, which will be
fixed as soon as possible. Thanks for reporting this!

Currently, the only way to retrain your model is to pass a datasource
containing the new data.

If you need this in the context of the application that you mentioned in a
previous post, here is a possible workaround:
- mark your model as supporting drillthrough
- whenever you need to retrain:
    - execute SELECT * FROM Model.CASES (to get the old training set)
    - (if memory becomes an issue, serialize the cases to a temporary file,
      possibly in the XML DataSet format)
    - append your new data to the old training set
    - execute a DELETE FROM followed by an INSERT INTO
Note that the parameterized INSERT INTO statement does not require a
DataTable; it works with an IDataReader implementation as well (this might
help if you are dealing with data cached externally).
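
The workaround steps above can be sketched as the three DMX statements they boil down to. This is only an illustration that builds the statement text; it does not connect to a server. The helper names, the sample structure/model/column names, and the rowset parameter name @Cases are all hypothetical, and the exact form of the parameterized INSERT INTO should be checked against the DMX reference for your build.

```python
def bracket(name: str) -> str:
    """Wrap a DMX identifier in square brackets (hypothetical helper)."""
    if "[" in name or "]" in name:
        raise ValueError(f"unsupported character in identifier: {name!r}")
    return f"[{name}]"


def retrain_statements(structure: str, model: str, columns: list[str]) -> dict[str, str]:
    """Return the DMX text for each step of the full-retrain workaround.

    Assumptions (not from the thread): the structure is cleared with
    DELETE FROM MINING STRUCTURE, and the merged cases are passed as a
    rowset parameter named @Cases.
    """
    cols = ", ".join(bracket(c) for c in columns)
    return {
        # Step 1: read the old training set (the model must allow drillthrough).
        "read_old_cases": f"SELECT * FROM {bracket(model)}.CASES",
        # Step 2: clear the trained content so INSERT INTO is accepted again.
        "clear": f"DELETE FROM MINING STRUCTURE {bracket(structure)}",
        # Step 3: retrain on old + new cases supplied through the parameter.
        "retrain": f"INSERT INTO MINING STRUCTURE {bracket(structure)} ({cols}) @Cases",
    }


if __name__ == "__main__":
    for step, dmx in retrain_statements(
        "Basket Structure", "Basket Model", ["Order Id", "Product"]
    ).items():
        print(f"{step}: {dmx}")
```

The client application would run the first statement, merge the returned rows with the new cases in memory (or in a temporary file), and then bind the merged rows to the parameter when executing the third statement.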

--
This posting is provided "AS IS" with no warranties, and confers no rights.
Please do not send email directly to this alias. It is for newsgroup
purposes only.

Regards,
bogdan
Gustavo Frederico
2007-11-16 16:20:21 UTC
Thanks for the reply.
Incremental training would be interesting for on-line learning scenarios. But that's ok.
If I follow the retraining steps you described, is there any downtime in the model usage? Is there a way to execute these steps in a transactional context, and would that avoid the downtime?

thanks - Gustavo
Bogdan Crivat [MSFT]
2007-11-16 16:57:56 UTC
There is downtime between the DELETE FROM and the completion of the INSERT INTO.
There may be two solutions:
1. using DMX
   - use the BeginTransaction / Commit methods of ADOMD.NET's AdomdConnection
     to wrap your DMX calls

2. with a staging table
   - create your mining structure / model based on a relational table
   - whenever you have new data in the application, just append it to that
     relational table
   - whenever you need to reprocess using the new data, call ProcessFull on
     the mining structure (using the XML DDL syntax, because DMX requires a
     DELETE first)
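
For option 2, a minimal sketch of what the XML DDL (XMLA) Process command could look like, built here as text only. The function name and the sample database and structure IDs are hypothetical; the element layout follows the usual XMLA Process command shape and should be verified against the XMLA reference before use.

```python
import xml.etree.ElementTree as ET

# Namespace used by Analysis Services XMLA commands.
XMLA_NS = "http://schemas.microsoft.com/analysisservices/2003/engine"


def process_full_command(database_id: str, structure_id: str) -> str:
    """Build an XMLA Process command that fully reprocesses a mining structure.

    database_id and structure_id are the object IDs on the server
    (hypothetical values are used in the example below).
    """
    process = ET.Element("Process", {"xmlns": XMLA_NS})
    obj = ET.SubElement(process, "Object")
    ET.SubElement(obj, "DatabaseID").text = database_id
    ET.SubElement(obj, "MiningStructureID").text = structure_id
    # ProcessFull discards the trained content and retrains from the bound table.
    ET.SubElement(process, "Type").text = "ProcessFull"
    return ET.tostring(process, encoding="unicode")


if __name__ == "__main__":
    print(process_full_command("MyDatabase", "Basket Structure"))
```

The resulting text would be sent to the server as an XMLA command after new rows have been appended to the staging table.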

thanks,
bogdan