Archive for May, 2010

Data Manipulation Part 1 : SQL

Data manipulation is a big part of the data mining process. Some authors claim it can take up 80% of a data mining project, and I can only agree. If the data comes from a data warehouse, the work can go a lot faster. If you have to dig into (and understand) operational systems or add external data, the work takes even more time. It is therefore of the greatest importance to be efficient at data manipulation. Currently I do this task in two ways, depending on the situation: big SQL queries or ETL.

Read more…